The JBits Bit-Serial Core Library provides the essential operators needed to hierarchically construct larger bit-serial cores with JBits.  The library includes adders, subtractors, two's-complement, delay, and other cores.  All of the cores accept Least Significant Bit (LSB) first data input, where the least significant bit is marked by a pulse of the control signal lasting one bit-time (one clock cycle for this core library).
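
     To make the input format concrete, the following plain-Java sketch serializes words LSB first and produces the matching control stream.  It is a behavioral illustration only: it uses no JBits classes, and the names LsbFirstStream, dataBits, and controlBits are hypothetical.

```java
/** Behavioral illustration only: plain Java, no JBits classes involved. */
final class LsbFirstStream {

    /** Returns the data line as '0'/'1' characters, earliest bit-time first. */
    static String dataBits(int[] words, int wordLength) {
        StringBuilder out = new StringBuilder();
        for (int word : words) {
            for (int i = 0; i < wordLength; i++) {
                // Bit 0 (the LSB) is sent first, then bit 1, and so on.
                out.append(((word >> i) & 1) == 1 ? '1' : '0');
            }
        }
        return out.toString();
    }

    /** Returns the control line: high for one bit-time on each word's LSB. */
    static String controlBits(int wordCount, int wordLength) {
        StringBuilder out = new StringBuilder();
        for (int w = 0; w < wordCount; w++) {
            for (int i = 0; i < wordLength; i++) {
                out.append(i == 0 ? '1' : '0');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int[] words = {6, 1};
        System.out.println("data:    " + dataBits(words, 4));
        System.out.println("control: " + controlBits(words.length, 4));
    }
}
```

     For the 4-bit words 6 and 1, the data line reads 01101000 and the control line reads 10001000, with the leftmost character being the earliest bit-time.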

Documentation
     I have generated Javadoc for all the cores and the classes they use (open index.html in the /doc directory).  However, reading the following few paragraphs will save you some serious debugging time.

Two of the same cores?
     You will notice that there are two different adder cores and two different subtractor cores.  The difference is significant: all four cores zero their carry or borrow flip-flops when the control signal input is asserted high, but one of the adders and one of the subtractors also zeroes one of its inputs.
     This is done to aid in the construction of serial-by-parallel constant-coefficient multipliers.

Why two different adder and subtractor cores?
     When a serial-by-parallel constant-coefficient multiplier is constructed from the CarrySaveAdderSlice and SubtractorSlice cores, each core ignores the result from the adjacent connected core while the control signal is asserted high.  Ignoring that result prevents undesirable side effects from the arithmetic performed on the previous data word processed by the multiplier.
     In other situations, such as the addition of two multiplier outputs (e.g. the sum 4*a + 5*b), it is appropriate to use the TapAdderSlice, which does not ignore its inputs.  The same holds true when performing subtractions with the TapSubtractorSlice.

Cores that zero only their carry or borrow flip-flops
     TapAdderSlice
     TapSubtractorSlice

Cores that zero an input in addition to their carry or borrow flip-flops
     CarrySaveAdderSlice
     SubtractorSlice

     A good rule of thumb is to use the cores that ignore one of their inputs when you are constructing a larger core hierarchically and each internal core receives an identical control signal at the same time (see sidebar).  You can also tell if you have chosen the right core by simulation.
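
     The difference between the two adder styles can be sketched with a small behavioral model in plain Java.  This is not the JBits API and not the cores' actual implementation: the class and method names are hypothetical, the model assumes that the zeroed input is the second operand (b), and it omits the one-clock-cycle output latency of the real cores.

```java
import java.util.Arrays;

/** Behavioral sketch of a bit-serial adder; not the library's implementation. */
final class SerialAdderModel {

    /**
     * Adds two streams of LSB-first words, one bit per clock cycle.
     * zeroInputB = false: only the carry is cleared while control is high.
     * zeroInputB = true:  input b is also forced to 0 while control is high
     *                     (assumption: b is the input that gets zeroed).
     */
    static int[] add(int[] a, int[] b, int wordLength, boolean zeroInputB) {
        int[] sums = new int[a.length];
        int carry = 0; // models the carry flip-flop
        for (int w = 0; w < a.length; w++) {
            for (int i = 0; i < wordLength; i++) {
                boolean control = (i == 0); // pulse marks the LSB
                int bitA = (a[w] >> i) & 1;
                int bitB = (b[w] >> i) & 1;
                if (control) {
                    carry = 0; // drop the carry left over from the previous word
                    if (zeroInputB) {
                        bitB = 0; // ignore whatever arrives on b this bit-time
                    }
                }
                int total = bitA + bitB + carry;
                sums[w] |= (total & 1) << i;
                carry = total >> 1;
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        int[] a = {15, 3};
        int[] b = {1, 5};
        System.out.println(Arrays.toString(add(a, b, 4, false)));
        System.out.println(Arrays.toString(add(a, b, 4, true)));
    }
}
```

     With 4-bit words, the carry-only style gives [0, 8]: 15 + 1 overflows to 0, and the leftover carry does not leak into 3 + 5.  The input-zeroing style gives [15, 7] because the LSB of b is discarded in every word, which is the kind of mismatch that shows up in simulation when the wrong style is used for a plain sum.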

Latency
     All of the cores in the core library have a latency of one clock cycle except for the delay cores and the "bsMDShiftUL" core (bit-serial Multiple Down Shift Unified Library primitives core). 
     The delay cores have a latency that you can set from one to n clock cycles. 
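
     A delay of n clock cycles behaves like a chain of n flip-flops.  The sketch below models that behavior in plain Java so the latency figure can be checked by hand; it is an assumption-level illustration with hypothetical names (DelayLineModel, clock, run), not the delay cores' actual structure or API.

```java
/** Behavioral sketch of an n-cycle delay; not the library's implementation. */
final class DelayLineModel {
    private final boolean[] stages; // stages[0] is the newest bit

    DelayLineModel(int latency) {
        if (latency < 1) {
            throw new IllegalArgumentException("latency must be at least 1");
        }
        stages = new boolean[latency];
    }

    /** Advances one clock cycle and returns the bit that entered 'latency' cycles ago. */
    boolean clock(boolean in) {
        boolean out = stages[stages.length - 1];
        for (int i = stages.length - 1; i > 0; i--) {
            stages[i] = stages[i - 1];
        }
        stages[0] = in;
        return out;
    }

    /** Feeds a '0'/'1' string through a fresh delay line, one character per cycle. */
    static String run(int latency, String input) {
        DelayLineModel line = new DelayLineModel(latency);
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            out.append(line.clock(c == '1') ? '1' : '0');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(3, "10000"));
    }
}
```

     With a latency of 3, the input 10000 comes out as 00010: the pulse reappears three clock cycles after it went in.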

Using bsMDShiftUL
     This core is useful for performing arbitrary arithmetic right shifts.  The BarDelay core must be used to provide a control signal to this core.  The BarDelay core should be triggered by a control signal that arrives one clock cycle earlier than the control signal accompanying the LSB of the data input to the bsMDShiftUL core.  The barLength parameter of the BarDelay core sets the number of bits shifted right.  The latency of bsMDShiftUL is barLength + 1 clock cycles.
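
     When checking bsMDShiftUL in simulation, it helps to have a reference value to compare against.  The plain-Java sketch below computes the arithmetic right shift of a word and the latency stated above.  It is a reference calculation with hypothetical names (ShiftReferenceModel, expectedWord, latencyCycles), not the core itself, and it assumes the result keeps the same word length and is sign-extended in two's complement.

```java
/** Reference calculation for checking simulation results; not the core itself. */
final class ShiftReferenceModel {

    /**
     * Arithmetic right shift of a wordLength-bit two's-complement word.
     * Assumption: the output word has the same length as the input word.
     */
    static int expectedWord(int word, int wordLength, int barLength) {
        if (wordLength < 1 || wordLength > 32) {
            throw new IllegalArgumentException("wordLength must be 1..32");
        }
        if (barLength < 0 || barLength >= wordLength) {
            throw new IllegalArgumentException("barLength must be 0..wordLength-1");
        }
        int pad = 32 - wordLength;
        int signed = (word << pad) >> pad;   // sign-extend to 32 bits
        int shifted = signed >> barLength;   // arithmetic shift keeps the sign
        int mask = (wordLength == 32) ? -1 : (1 << wordLength) - 1;
        return shifted & mask;               // back to wordLength bits
    }

    /** Latency in clock cycles, as stated in the text: barLength + 1. */
    static int latencyCycles(int barLength) {
        return barLength + 1;
    }

    public static void main(String[] args) {
        System.out.println(expectedWord(0b1100, 4, 1)); // -4 shifted right by one bit
        System.out.println(latencyCycles(3));
    }
}
```

     For example, the 4-bit word 1100 (-4) shifted right by one bit is expected to be 1110 (-2), and a barLength of 3 corresponds to a latency of 4 clock cycles.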

Help, Comments, & Additional Info
     Email: aycarrei@shaw.ca