summaryrefslogtreecommitdiff
path: root/src/target/xscale.c
Commit message (Collapse)AuthorAgeFilesLines
* types: write memory now uses constØyvind Harboe2011-04-011-4/+4
| | | | Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* xscale: fix gaffe in phys writeØyvind Harboe2011-03-311-1/+1
| | | | | | | it would *read* instead of *write* to memory when the MMU was disabled. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* error handling: the error number is not part of the user interfaceØyvind Harboe2011-01-031-2/+2
| | | | | | | | Do not propagate error number to user. This is for internal programming purposes only. Error messages to the user is reported as text via LOG_ERROR(). Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* xscale: trace buffer remains enabled until explicitly disabledMike Dunn2010-12-041-51/+87
| | | | | | | | | | | | | | | | | | | | | Hi everyone, Since a call went out for patches... been sitting on this for months. For some reason, the xscale trace buffer is automatically disabled as soon as a break occurs and the trace data is collected. This patch was a result of the frustration of always re-enabling it, or else hitting a breakpoint and checking the trace data, only to discover that I forgot to re-enable it before resuming. Don't see why it should work this way. There is no run-time penalty, AFAIK. Along the way, I also cleaned up a little by removing the ugly practice of recording wrap mode by setting the fill count variable to "-1", replacing it with an enum that records the trace mode. I've been using this for months. Comments, criticisms gratefully received. Mike Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
* warnings: fix alignment warningsØyvind Harboe2010-09-201-1/+1
| | | | | | | These warnings are for architectures that do not support non-aligned word access. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* xscale: check that wp length does not exceed addressMike Dunn2010-09-201-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi everyone, A while back I sent in a patch that adds support for watchpoint lengths greater than four on xscale. It's been working well, until the other day, when it caused an unexpected debug exception. Looking into this I realized there is a case where it breaks: when the length arg is greater than the base address. This is a consequence of the way the hardware works. Don't see a work-around, so I added code to xscale_add_watchpoint() to check for and disallow this combination. Some more detail... xscale watchpoint hardware does not support a length directly. Instead, a mask value can be specified (not to be confused with the optional mask arg to the wp command, which xscale does not support). Any bits set in the mask are ignored when the watchpoint hardware compares the access address to the watchpoint address. So as long as the length is a power of two, setting the mask to length-1 effectively specifies the length. Or so I thought, until I realized that if the length exceeds the base address, *all* bits of the base address are ignored by the comaparator, and the watchpoint range effectively becomes 0 .. length. Questions, comments, criticisms gratefully received. Thanks, Mike Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
* xscale: bp/wp: additional LOG_ERROR on failureMike Dunn2010-09-201-6/+10
| | | | | | | | | | | | | | | | | | | | | Hi everyone, Added more LOG_ERROR messsages to watchpoint and breakpoint code, given that the infrastructure no longer interprets returned error codes. Also changed existing LOG_INFO and LOG_WARNING to LOG_ERROR for cases where an error is returned. Note that the check of the target state is superflous, since the infrastruture code currently checks this before calling target code. Is this being reconsidered as well? Also, should we stop returning anything other than ERROR_OK and ERROR_FAIL? Comments gratefully received. Thanks, Mike Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
* xscale: fix sw breakpoints for thumb; set bp immediatelyMike Dunn2010-09-131-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi everyone, Version 2 of this patch. Code added to breakpoints.c was removed from previous patch, and item 3 added, per discussion with Øyvind regarding error reporting. Item 4 added, which I just noticed. I tried to use a software breakpoint in thumb code on the xscale for the first time recently, and was surprised to find that it didn't work. The result was this patch, which does four things: 1): fix trivial cut-n-paste error that caused thumb breakpoints to not work 2): call xscale_set_breakpoint() from xscale_add_breakpoint() 3): log error on data abort in xscale_write_memory() 4): fixed incorrect error code returned by xscale_set_breakpoint() when no breakpoint register is available; added comment Item 2 not only makes the xscale breakpoint code consistent with other targets, but also alerts the user immediately if an error occurs when writing the breakpoint instruction to target memory (previously, xscale_set_breakpoint() was not called until execution resumed). Also, calling xscale_breakpoint_set() as part of the call chain starting with handle_bp_command() and propagating the return status back up the chain avoids the situation where OpenOCD "thinks" the breakpoint is set when in reality an error ocurred. Item 3 provides a helpful message for a common reason for failure to set sw breakpoint. This was thoroughly tested, mindful of the fact that breakpoint management is somewhat dicey during single-stepping. Comments and criticisms of course gratefully received. Mike Signed-off-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* xscale: mark xscale registers invalid on debug entryMike Dunn2010-09-091-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Hi everyone, This simple patch fixes a problem I noticed on the xscale where incorrect values are sometimes reported by the reg command. The problem can occur when requesting the value of registers in the xscale-specific register cache. With a couple of exceptions, none of the registers in the xscale register cache are automatically retrieved on debug entry. This is probably fine, as they are unlikely to be needed on a regular basis during a typical debug session, and they can be retrieved when explicitly requested by name using the reg command. The problem is that once this is done, the register remains marked as valid for the remainder of the OpenOCD session, and the reg command will henceforth always report the same value because it is obtained from the cache and is never again retrieved from the debug handler on the target. The fix is to mark all registers in the xscale register cache as invalid on debug entry (before the two exceptions are retrieved), thus forcing retrieval (when requested) from the target across resumptions in execution, and avoiding the reporting of stale values. Small addition change by Øyvind: change 'i' to unsigned to fix compiler warning for xscale_debug_entry() fn. Signed-off-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* debug: use assert's when approperiateØyvind Harboe2010-08-111-2/+1
| | | | | | error was returned instead of using assert. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* arm: add error propagation for enable/disable mmu cachesØyvind Harboe2010-07-191-12/+32
| | | | Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* arm: add error propagation to generic get_ttb fnØyvind Harboe2010-07-191-3/+8
| | | | Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* TARGET: removed unused parametersAntonio Borneo2010-06-121-3/+1
| | | | | | | Parameters "domain" and "ap" of function armv4_5_mmu_translate_va() are not used. Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
* TARGET: removed unsed parameterAntonio Borneo2010-06-121-2/+2
| | | | | | | | Parameter "type" of function armv4_5_mmu_translate_va() is now not used. Remove the parameter and the "enum" listing its values. Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
* TARGET: fix handling return code of MMU translationAntonio Borneo2010-06-121-4/+0
| | | | | | | | | Function armv4_5_mmu_translate_va() now properly signals errors in the return value. Remove former error handling by setting variable "type" to value "-1". Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
* arm mmu: error propagation added for address translationØyvind Harboe2010-06-111-1/+4
| | | | | | | The return value for MMU translation was a mess, either error or value. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* xscale: add support for length arg to wp commandMike Dunn2010-04-241-8/+51
| | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the length argument to the xscale implementation of the wp command. Per discussion with David, the length argument specifies the range of addresses over which a memory access should generate a debug exception. This patch utilizes the "mask" feature of the xscale debug hardware to implement the correct functionality of the length argument. Some limitations imposed by the hardware are: - The length must be a power of two, with a minumum of 4. - Two data breakpoint registers are available, allowing for two watchpoints. However, if the length of a watchpoint is greater than four, both registers are used (the second for a mask value), limiting the number of watchpoints to one. This patch also removes a useless call to xscale_get_reg(dbcon) in xscale_set_watchpoint() (value had already been read from the register cache, and the same previously read value is then modified and written back). I have been using and testing this patch for a couple days. Questions, corrections, criticisms of course gratefully received.
* xscale: fix analyze_trace for trace data collected in wrap modeMike Dunn2010-04-151-214/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the xscale_analyze_trace() function. This function was defective for a trace collected in 'fill' mode (hiccups with repeated instructions) and completely broken when buffer overflowed in 'wrap' mode. The reason for the latter case is that the checkpoint registers were interpreted incorrectly when two checkpoints are present in the trace (which will be true in 'wrap' mode once the buffer fills). In this case, checkpoint1 register will contain the older entry, and checkpoint0 the newer. The original code assumed the opposite. I eventually gave up trying to understand all the logic of the function, and rewrote it. I think it's much cleaner and understandable now. I have been using and testing this for a few weeks now. I'm confident it hasn't regressed in any way. Also added capability to handle (as best as possible) the case where an instruction can not be read from the loaded trace image; e.g., partial image. This was a 'TODO' comment in the original xscale_analyze_trace(). Outside of xcsale_analyze_trace(), these (related) changes were made: - Remove pc_ok and current_pc elements from struct xscale_trace. These elements and associated logic are useless clutter because the very first entry placed in the trace buffer is always an indirect jump to the address at which execution resumed. This type of trace entry includes the literal address in the trace buffer, so the initial address of the trace is immediately determined from the trace buffer contents and does not need to be recorded when trace is enabled. - Added num_checkpoints to struct xscale_trace_data, which is necessary in order to correctly interpret the checkpoint register contents. - In xscale_read_trace() - Fix potential array out-of-bounds condition. - Eliminate partial address entries when parsing trace (can occur in wrap mode). - Count and record number of checkpoints in trace. - Added small, inlined utility function xscale_display_instruction() to help make the code more concise and clear. TODO: - Save processor state (arm or thumb) in struct xscale_trace when trace is enabled so that trace can be analyzed correctly (currently assumes arm mode). - Add element to struct xscale_trace that records (when trace is enabled) whether vector table is relocated high (to 0xffff0000) or not, so that a branch to an exception vector is traced correctly (curently assumes vectors at 0x0).
* xscale: fix trace buffer functionality when resuming from a breakpointMike Dunn2010-03-291-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: halt at a breakpoint, enable trace buffer ('xscale trace_buffer enable fill'), then resume. Wait for debug exception when trace buffer fills (if not sooner due to another breakpoint, vector catch, etc). Instead, never halts. When halted explicitly from OpenOCD and trace buffer dumped, it contains only one entry; a branch to the address of the original breakpoint. If the above steps are repeated, except that the breakpoint is removed before resuming, the trace buffer fills and the debug exception is generated, as expected. Cause: related to how a breakpoint is stepped over on resume. The breakpoint is temporarily removed, and a hardware breakpoint is set on the next instruction that will execute. xscale_debug_entry() is called when that breakpoint hits. This function checks if the trace buffer is enabled, and if so reads the trace buffer from the target and then disables the trace (unless multiple trace buffers are specified by the user when trace is enabled). Thus you only trace one instruction before it is disabled. Solution: kind of a hack on top of a hack, but it's simple. Anything better would involve some refactoring. This has been tested and trace now works as intended, except that the very first instruction is not part of the trace when resuming from a breakpoint. TODO: still many issues with trace: doesn't work during single-stepping (trace buffer is flushed each step), 'xscale analyze_trace' works only marginally for a trace captured in 'fill' mode, and not at all for a trace captured in 'wrap' mode. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* fix software breakpoints on xscaleMike Dunn2010-03-221-2/+12
| | | | | | | | | | | | | | | | | | | | | | This patch fixes xscale software breakpoints by cleaning the dcache and invalidating the icache after the bkpt instruction is inserted or removed. The icache operation is necessary in order to flush the fetch buffers, even if the icache is disabled (see section 4.2.7 of the xscale core developer's manual). The dcache is presumed to be enabled; no harm done if not. The dcache is also invalidated after cleaning in order to safeguard against a future load of invalid data, in the event that cache_clean_address points to memory that is valid and in use. Also corrected a confusing typo I noticed in a comment. TODO (or not TODO...?): the xscale's 2K "mini dcache" is not cleaned. This cache is not used unless the 'X' bit in the page table entry is set. This is a proprietary xscale extension to the ARM architecture. If a target's OS or executive makes use of this for memory regions holding code, the breakpoint problem will persist. Flushing the mini dcache requires that 2K of valid cacheable memory (mapped with 'X' bit set) be designated by the user for this purpose. The debug handler that gets downloaded to the target will also need to be extended.
* jtag: move towards making out_value constØyvind Harboe2010-03-201-1/+1
| | | | | | | These were relatively straightforward fixes which are backwards compatible. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* jtag: retire jtag_get/set_end_state()Øyvind Harboe2010-03-191-14/+0
| | | | | | | Voila! This get rids of mysteries about what what state the TAP is in. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* jtag: get rid of unecessary jtag_get_end_state()Øyvind Harboe2010-03-191-12/+21
| | | | | | By code inspection. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* Fix underlying problem with xscale icache and dcache commandsMike Dunn2010-03-181-6/+11
| | | | | | | | | | | | | | | | | | | | Fix problem with the xscale icache and dcache commands. Both commands were enabling or disabling the mmu, not the caches I didn't look any further after my earlier patch fixed the trivial problem with command argument parsing. Turns out the underlying code was broken. The resolution is straightforward when you look at the arguments to xscale_enable_mmu_caches() and xscale_disable_mmu_caches(). I finally took a deeper look after dumping the cp15 control register (XSCALE_CTRL) and seeing that the cache bits weren't changing, but the mmu bit was (which caused all manner of grief, as you can imagine). This has been tested and works OK now. src/target/xscale.c | 17 +++++++++++------ 1 files changed, 11 insertions(+), 6 deletions(-) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* jtag: cut down on usage of unintended modification of global end stateØyvind Harboe2010-03-181-14/+14
| | | | | | | | | | | | | jtag_get/set_end_state() is now deprecated. There were lots of places in the code where the end state was unintentionally modified. The big Q is whether there were any places where the intention was to modify the end state. 0.5 is a long way off, so we'll get a fair amount of testing. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* image loading: fix problem with offsets > 0x80000000Bradey Honsinger2010-03-161-1/+1
| | | | | | | | | | | | | | | Fixes bug that prevented users from specifying a base address of 0x80000000 or higher in image commands (flash write_image, etm image, xscale trace_image). image.base_address is an offset from the start address contained in the image file (if there is one), or from 0 (for binary files). As a signed 32-bit int, it couldn't be greater than 0x7fffffff, which is a problem when trying to write a binary file to flash above that address. Changing it to a 64-bit long long keeps it as a signed offset, but allows it to cover the entire 32-bit address space. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* fix xscale icache and dcache commandsMike Dunn2010-03-111-3/+3
| | | | | | | | | Simple patch that fixes the broken xscale icache and dcache commands. This broke when the helper functions and macros were changed. [ dbrownell@users.sourceforge.net: don't use strcasecmp ] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* jtag: jtag_add_ir_scan() now takes a single fieldØyvind Harboe2010-03-081-1/+1
| | | | | | | | In the code a single field was all that was ever used. Makes jtag_add_ir_scan() simpler and leaves more complicated stuff to jtag_add_plain_ir_scan(). Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* jtag: retire tap fieldØyvind Harboe2010-03-081-31/+11
| | | | | | | | | | | | | | | | | jtag_add_dr/ir_scan() now takes the tap as the first argument, rather than for each of the fields passed in. The code never exercised the path where there was more than one tap being scanned, who knows if it even worked. This simplifies the implementation and reduces clutter in the calling code. use jtag_add_ir/dr_plain_scan() for more fancy situations. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
* ARM: keep a handle to the PCDavid Brownell2010-02-211-27/+39
| | | | | | | | | | | | | | | | Keep a handle to the PC in "struct arm", and use it. This register is used a fair amount, so this is a net minor code shrink (other than some line length fixes), but mostly it's to make things more readable. For XScale, fix a dodgy sequence while stepping. It was initializing a variable to a non-NULL value, then updating it to handle the step-over-active-breakpoint case, and then later testing for non-NULL to see if it should reverse that step-over-active logic. It should have done like ARM7/ARM9 does: init to NULL. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* XScale: help/usage updatesDavid Brownell2010-01-071-36/+43
| | | | | | | | | | | | | | Provide helptext which was sometimes missing; update some of it to be more accurate (mostly they display something w/no args). Usage syntax messages have the same EBNF as the User's Guide. In some cases, *exactly* what the user's guide shows... e.g. talking about "offset" not "address" for trace_image. Don't use "&function"; functions are like arrays, their name is their address. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: add comment re register exportsDavid Brownell2009-12-261-0/+1
| | | | Modern versions of GDB can understand VFP3 and iwMMXt hardware.
* XScale: better {read,write}_phys()David Brownell2009-12-181-2/+14
| | | | | | | | We can actually do the right thing if the MMU is off; save the error message for the phys-but-MMU-enabled path, which is what isn't yet supported. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* XScale: use all-ones for BYPASS, not five-onesDavid Brownell2009-12-151-1/+1
| | | | | | PXA3xx has more than five bits in IR. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: list number of HW breakpoints/watchpointsDavid Brownell2009-12-071-0/+3
| | | | | | | | | | | | | | | When starting up, say how many hardware breakpoints and watchpoints are available on various targets. This makes it easier to tell GDB how many of those resources exist. Its remote protocol currently has no way to ask OpenOCD for that information, so it must configured by hand (or not at all). Update the docs to mention this; remove obsolete "don't do this" info. Presentation of GDB setup information is still a mess, but at least it calls out the three components that need setup. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: don't clone arm_arch_state() codeDavid Brownell2009-12-071-9/+2
| | | | | | | | | | | Have various ARM cores delegate to arm_arch_state() to display basic information, instead of duplicating that logic. This shrinks the code, makes them all report when semihosting is active, and highlights which data are specific to this core. (Like ARM720 not having separate instruction and data caches.) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: use <target/arm.h> not armv4_5.hDavid Brownell2009-12-071-0/+1
| | | | | | | | | | | | | | Move most declarations in <target/armv4_5.h> to <target/arm.h> and update users. What's left in the older file is stuff that I think should be removed ... the old register cache access stuff, which makes it awkward to support microcontroller profile (Cortex-M) cores. The armv4_5_run_algorithm() declaration was moved too, even though it's not yet as generic as it probably ought to be. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: rename some generic routinesDavid Brownell2009-12-071-2/+2
| | | | | | | | | | | | | | | | Rename some (mostly) generic ARM functions: armv4_5_arch_state() --> arm_arch_state() armv4_5_get_gdb_reg_list() --> arm_get_gdb_reg_list() armv4_5_init_arch_info() --> arm_init_arch_info() Cores using the microcontroller profile may want a different arch_state() routine though. (Also fix strange indentation in arm_arch_state: use tabs only! And update a call to it, removing assignment-in-conditional.) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: move opcode macros to <target/arm_opcodes.h>David Brownell2009-12-071-0/+1
| | | | | | | | | | | | | | | Move the ARM opcode macros from <target/armv4_5.h>, and a few Thumb2 ones from <target/armv7m.h>, to more appropriate homes in a new <target/arm_opcodes.h> file. Removed duplicate opcodes from that v7m/Thumb2 set. Protected a few macro argument references by adding missing parentheses. Tightening up some of the line lengths turned up a curious artifact: the macros for the Thumb opcodes are all 32 bits wide, not 16 bits. There's currently no explanation for why it's done that way... Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: rename armv4_5_build_reg_cache() as arm_*()David Brownell2009-12-041-1/+1
| | | | Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: ARMV4_5_COMMON_MAGIC --> ARM_COMMON_MAGICDavid Brownell2009-12-041-1/+1
| | | | Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: switch target_to_armv4_5() to target_to_arm()David Brownell2009-12-041-3/+3
| | | | | | And remove that old symbol. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: rename armv4_5_state_* as arm_state_*David Brownell2009-12-041-1/+1
| | | | | | And make arm_state_strings[] be const. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: rename armv4_5_mode_* AS arm_mode_*David Brownell2009-12-041-4/+4
| | | | Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: rename ARMV4_5_MODE_* as ARM_MODE_*David Brownell2009-12-041-5/+5
| | | | Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* ARM: rename ARMV4_5_STATE_* as ARM_STATE_*David Brownell2009-12-041-5/+5
| | | | Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* change #include "time_support.h" to <helper/time_support.h>Zachary T Welch2009-12-031-1/+1
| | | | | | | | | | | | Changes from the flat namespace to heirarchical one. Instead of writing: #include "time_support.h" the following form should be used. #include <helper/time_support.h> The exception is from .c files in the same directory.
* XScale: restore_context() cleanupDavid Brownell2009-11-301-38/+30
| | | | | | | | | | | Clean up two aspects to this routine: bad naming, since it doesn't restore the context, just the banked registers; and excess indentation for the bulk of the code. Also make some of its call sites stash the function's return code; someday they should use it for error checking. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* XScale: context restore, cleanup/bugfixDavid Brownell2009-11-301-15/+29
| | | | | | | | | | | | | | This "loop over all registers" routine shared the same mess as full_context() in terms of dozens of needless number_to_mode() calls. Fix that, and comments, with related cleanup. The misnamed xscale_restore_context() had a related bug. It was restoring the *WRONG REGISTERS* ... always from whatever the current mode was, instead of using the copy from whichever register bank it was trying to restore. (But it marked the intended register as having been restored...) Fixed that. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
* XScale: clean up full_context() (#2)David Brownell2009-11-291-39/+30
| | | | | | | Streamline the loop by continuing as soon as we know there's no work to be done; this lets us un-indent almost everything. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>