Barcode Scanners Are Keyboards With Extra Steps
Back in 2019 I was building a warehouse management system for a client, as a sole dev. One core workflow involved parcel orders: a worker would point an EAN-13 barcode scanner at an order sheet, and the scanned code would land in the web app I built for them. The warehouse management app would then use that identifier to look up the items associated with the order, their quantities, and the fastest pick-up route across the warehouse layout.
I had never held a barcode scanner in my hand before.
I expected it to need an SDK, or perhaps a WebUSB integration. Instead, the client handed me a USB barcode scanner and said: “just plug it in”.
This device was pre-configured in “keyboard mode”: as long as you have a focused <input> field in the browser, you can scan a barcode, and see its number appear character by character, followed by an Enter keypress.
That experience made me curious about the full pipeline. How does a pattern of black and white lines encode a number without ambiguity? How do the scanner’s optics decode it? How does the scanner present itself as a keyboard? Let’s find out.
What Is a Barcode?
A barcode is essentially an image that encodes data as a pattern of vertical lines. Each narrow strip is either black or white, representing 1s and 0s. Scanners read this pattern optically, converting it into a number or string.
The barcodes relevant to this article are one-dimensional, and are the ones you will find in any retail store or warehouse facility. Though the original concept dates back to the late ’40s, barcodes were de facto standardized for the US market in 1973, with the introduction of the Universal Product Code (UPC). UPC-A supports exactly 12 decimal digits, while EAN-13, the barcode used in the rest of the world, supports 13 decimal digits.
UPC-A: The First Barcode Standard
A UPC-A barcode encodes exactly 12 decimal digits. Out of these, the 12th digit is a check digit, i.e., an arithmetic safeguard against scanning errors.
A module is the narrowest bar or space in the barcode, the fundamental unit of the encoding. The physical barcode contains exactly 95 modules. Each module is either black (a bar) or white (a space). The modules are organized as follows:
- Start guard: 101 (bar-space-bar, 3 modules)
- Left 6 digits: 6 × 7 = 42 modules, encoded using L-codes
- Center guard: 01010 (space-bar-space-bar-space, 5 modules)
- Right 6 digits: 6 × 7 = 42 modules, encoded using R-codes
- End guard: 101 (bar-space-bar, 3 modules)
Each digit maps to a 7-module binary pattern. This mapping is actually a bijection, as it is unambiguous in both directions.
The left-side digits use L-codes (odd parity, meaning an odd number of 1-bits), while the right-side digits use R-codes (even parity). R-codes are the bitwise complement of L-codes. This is on purpose: the parity difference lets scanners determine whether they are reading the barcode forwards or backwards. The worker holding the barcode scanner does not need to worry about orientation.
Figure: each digit in a UPC-A barcode is encoded as a 7-module pattern of bars (black) and spaces (white). L-codes start with 0 (space), end with 1 (bar), and have odd parity, so the scanner knows it is reading a left-side digit. R-codes are their bitwise complement.
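The layout above is small enough to sketch in code. The following is a minimal illustration, not a production encoder: `encodeUpcA` is a name I made up, and the function assumes it receives all 12 digits with the check digit already in place. The L-code table is the standard UPC/EAN one, and the R-codes are derived from it by complementing each bit.

```javascript
// Standard UPC/EAN L-code table: index = digit, value = 7-module pattern
// (1 = bar, 0 = space).
const L_CODES = [
  '0001101', '0011001', '0010011', '0111101', '0100011',
  '0110001', '0101111', '0111011', '0110111', '0001011',
];

// R-codes are the bitwise complement of L-codes.
const complement = (pattern) =>
  [...pattern].map((bit) => (bit === '0' ? '1' : '0')).join('');
const R_CODES = L_CODES.map(complement);

// Hypothetical helper: expects all 12 digits, check digit included.
function encodeUpcA(digits) {
  if (!/^\d{12}$/.test(digits)) {
    throw new Error('UPC-A needs exactly 12 decimal digits');
  }
  const left = [...digits.slice(0, 6)].map((d) => L_CODES[Number(d)]).join('');
  const right = [...digits.slice(6)].map((d) => R_CODES[Number(d)]).join('');
  // start guard + left half + center guard + right half + end guard
  return '101' + left + '01010' + right + '101';
}
```

Calling `encodeUpcA('042000032339')` returns a 95-character string of 1s and 0s, one character per module.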
Check Digit in UPC-A
The 12th digit is computed using a weighted Modulo 10 algorithm. The idea: multiply each of the first 11 digits by an alternating weight, sum the results, and find the value that brings the total up to the nearest multiple of 10. That value is the check digit.
Odd positions (1st, 3rd, 5th, 7th, 9th, 11th) get weight 3. Even positions (2nd, 4th, 6th, 8th, 10th) get weight 1. Take the barcode 042000032339, whose first 11 digits are 04200003233. First, sum the odd-position digits and multiply by 3:
3 × (0 + 2 + 0 + 0 + 2 + 3) = 3 × 7 = 21
Then sum the even-position digits:
4 + 0 + 0 + 3 + 3 = 10
Add the two sums together:
21 + 10 = 31
The check digit is whatever value brings 31 up to the next multiple of 10, which is 40. Formally:
check digit = (10 - (31 mod 10)) mod 10 = 9
The outer "mod 10" handles the edge case where the sum is already a multiple of 10, resulting in the check digit being 0, rather than an overflowing 10.
If even a single digit is misread, the check digit will mismatch, causing the scanner to emit an error.
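The weighted Modulo 10 calculation fits in a few lines. This is a sketch under the assumptions stated in this section; `upcACheckDigit` and `isValidUpcA` are illustrative names, not part of any library.

```javascript
// Computes the UPC-A check digit from the first 11 digits.
function upcACheckDigit(first11) {
  if (!/^\d{11}$/.test(first11)) {
    throw new Error('expected exactly 11 decimal digits');
  }
  let sum = 0;
  [...first11].forEach((ch, index) => {
    // index 0 is the 1st position, so even indexes are the odd positions.
    const weight = index % 2 === 0 ? 3 : 1;
    sum += Number(ch) * weight;
  });
  // The outer modulo turns a result of 10 into 0.
  return (10 - (sum % 10)) % 10;
}

// Validates a full 12-digit UPC-A code against its own check digit.
function isValidUpcA(code) {
  return /^\d{12}$/.test(code)
    && upcACheckDigit(code.slice(0, 11)) === Number(code[11]);
}
```

For the example above, `upcACheckDigit('04200003233')` returns 9, and changing any single digit of 042000032339 makes `isValidUpcA` return false.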
EAN-13: The International Standard
While UPC-A was the first barcode standard, its usage is mostly limited to the US and some Commonwealth countries. European facilities, including my client’s warehouse, use EAN-13 instead.
EAN-13 encodes 13 decimal digits instead of 12, but here is the constraint that makes the design clever: it uses the exact same 95-module physical layout. No extra bars, no wider barcode. The 13th digit is encoded implicitly.
The G-Code Trick
We just saw that UPC-A uses two encoding tables: L-codes for the left side, R-codes for the right side. EAN-13 introduces a third table, G-codes, which are R-codes read in reverse (equivalently, G-codes are the bitwise complement of mirrored L-code patterns). The right side of an EAN-13 barcode still uses R-codes, identical to UPC-A. The difference lies entirely on the left side.
In UPC-A, all six left-side digits use L-codes. EAN-13 changes this: each of the six left-side digits uses either an L-code or a G-code, and the specific sequence of L’s and G’s implicitly encodes the leading digit, i.e., the extra 13th digit that EAN-13 adds at the front of the number. This is the digit that never appears as a physical bar on the barcode.
There are 10 possible L/G sequences, one per leading digit value (0-9). When the leading digit is 0, the sequence is all L’s, which produces the exact same encoding as UPC-A. This has a very convenient implication: any valid UPC-A barcode is also a valid EAN-13 with a leading 0, because the physical barcode is identical, bit for bit.
The three encoding tables interact as follows. L-codes and G-codes both start with 0 and end with 1, while R-codes start with 1 and end with 0. L-codes have odd parity; G-codes and R-codes have even parity. A scanner distinguishes left-side codes from R-codes by the leading bit, then distinguishes L from G by parity. This three-way distinction is what makes bidirectional scanning work for EAN-13.
EAN-13 encodes 13 digits in the same 95 modules as UPC-A. The leading digit is not printed as bars; instead, it is encoded implicitly through the L/G parity pattern of the six left-side digits.
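Here is a rough sketch of the G-code trick in code. The L-code table and the ten L/G sequences are the standard EAN-13 ones; `encodeEan13` and the other names are my own, and the function assumes the check digit is already part of the 13-digit input.

```javascript
const L_CODES = [
  '0001101', '0011001', '0010011', '0111101', '0100011',
  '0110001', '0101111', '0111011', '0110111', '0001011',
];
const complement = (p) => [...p].map((b) => (b === '0' ? '1' : '0')).join('');
const reverse = (p) => [...p].reverse().join('');

const R_CODES = L_CODES.map(complement); // R = complement of L
const G_CODES = R_CODES.map(reverse);    // G = R read in reverse

// L/G sequence for the six left-side digits, indexed by the leading digit.
const PARITY = [
  'LLLLLL', 'LLGLGG', 'LLGGLG', 'LLGGGL', 'LGLLGG',
  'LGGLLG', 'LGGGLL', 'LGLGLG', 'LGLGGL', 'LGGLGL',
];

// Hypothetical helper: expects all 13 digits, check digit included.
function encodeEan13(digits) {
  if (!/^\d{13}$/.test(digits)) {
    throw new Error('EAN-13 needs exactly 13 decimal digits');
  }
  // The leading digit is never drawn; it only selects the L/G sequence.
  const sequence = PARITY[Number(digits[0])];
  const left = [...digits.slice(1, 7)]
    .map((d, i) => (sequence[i] === 'L' ? L_CODES : G_CODES)[Number(d)])
    .join('');
  const right = [...digits.slice(7)].map((d) => R_CODES[Number(d)]).join('');
  return '101' + left + '01010' + right + '101';
}
```

With a leading 0 the sequence is all L’s, so `encodeEan13('0042000032339')` yields the same 95 modules as the UPC-A barcode 042000032339.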
Check Digit in EAN-13
EAN-13 uses the same weighted Modulo 10 algorithm as UPC-A, but the weights are swapped: odd positions get weight 1, even positions get weight 3. The check digit is the 13th digit; the first 12 digits are used to compute it.
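Consider the EAN-13 barcode 4006381333931, whose first 12 digits are 400638133393. Sum the odd-position digits:
4 + 0 + 3 + 1 + 3 + 9 = 20
Sum the even-position digits and multiply by 3:
3 × (0 + 6 + 8 + 3 + 3 + 3) = 3 × 23 = 69
Add the two sums together:
20 + 69 = 89
The check digit brings 89 up to the next multiple of 10, which is 90. Formally:
check digit = (10 - (89 mod 10)) mod 10 = 1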
The familiar simplicity of this algorithm is intentional: EAN-13 was designed as a backward-compatible superset of UPC-A, not a replacement. The more years pass, the more I appreciate the attention to backward compatibility in the standards we use every day.
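The EAN-13 variant differs from the UPC-A one only in its weights. A minimal sketch, with `ean13CheckDigit` and `isValidEan13` as illustrative names:

```javascript
// Computes the EAN-13 check digit from the first 12 digits.
function ean13CheckDigit(first12) {
  if (!/^\d{12}$/.test(first12)) {
    throw new Error('expected exactly 12 decimal digits');
  }
  let sum = 0;
  [...first12].forEach((ch, index) => {
    // index 0 is the 1st position: odd positions weigh 1, even positions 3.
    const weight = index % 2 === 0 ? 1 : 3;
    sum += Number(ch) * weight;
  });
  return (10 - (sum % 10)) % 10;
}

// Validates a full 13-digit EAN-13 code against its own check digit.
function isValidEan13(code) {
  return /^\d{13}$/.test(code)
    && ean13CheckDigit(code.slice(0, 12)) === Number(code[12]);
}
```

`ean13CheckDigit('400638133393')` returns 1. The swapped weights are also what keeps UPC-A codes valid: prefixing 042000032339 with a 0 shifts every digit by one position, so each digit keeps the weight it had under UPC-A and `isValidEan13('0042000032339')` is true.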
How Do Barcode Scanners Work?
A barcode scanner does not “see” digits. It does not understand what 042000032339 means.
It measures bars and spaces, their widths and transitions, and looks up the resulting pattern in a table.
The scanner shines a light across the barcode. In traditional scanners, this is a laser beam, but if you bought a barcode reader recently, it probably uses an LED array. Black bars absorb most of the light, while white spaces reflect it. A photodetector captures the reflected light and converts it into an electrical signal: high voltage for spaces (bright reflections), low voltage for bars (dim reflections).
A threshold then binarizes the signal, converting the continuous waveform into a stream of 1s and 0s. The decoder measures the width of each bar and space, segments the binary stream into 7-module chunks, and matches each chunk against the known L-code and R-code tables (plus the G-code table for EAN-13). Guard patterns mark the start, middle, and end of the barcode.
A laser scan line sweeps across the barcode. Black bars absorb light (low reflectance); white spaces reflect it (high reflectance). The scanner measures these transitions and decodes the binary pattern.
I am glossing over much of the analog signal processing here, but you get the gist. Real scanners deal with ink bleed, skewed scan angles, damaged labels, and variable print quality: challenges that’d deserve a post of their own. But that’s for another day.
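To make the pipeline concrete, here is a toy decoder for an idealized UPC-A scan line. It is a sketch under generous assumptions: the samples are clean reflectance values between 0 and 1, the quiet zones are bright, and the symbol is not distorted. Every function name is made up for this example, and none of the messy analog work mentioned above is modeled.

```javascript
const L_CODES = [
  '0001101', '0011001', '0010011', '0111101', '0100011',
  '0110001', '0101111', '0111011', '0110111', '0001011',
];
const complement = (p) => [...p].map((b) => (b === '0' ? '1' : '0')).join('');
const reverse = (s) => [...s].reverse().join('');
const L_LOOKUP = new Map(L_CODES.map((p, digit) => [p, digit]));
const R_LOOKUP = new Map(L_CODES.map((p, digit) => [complement(p), digit]));

// Step 1: threshold the reflectance samples. Dark means bar, written as 1.
function binarize(samples, threshold = 0.5) {
  return samples.map((s) => (s < threshold ? '1' : '0')).join('');
}

// Step 2: trim the quiet zones and rescale runs of samples to whole modules.
function toModules(bits) {
  const trimmed = bits.replace(/^0+/, '').replace(/0+$/, '');
  const runs = trimmed.match(/1+|0+/g) || [];
  const moduleWidth = trimmed.length / 95; // the symbol spans 95 modules
  return runs
    .map((run) => run[0].repeat(Math.round(run.length / moduleWidth)))
    .join('');
}

// Step 3: check the guards, then look up each 7-module chunk.
function decodeModules(modules) {
  if (modules.length !== 95) return null;
  if (!modules.startsWith('101') || !modules.endsWith('101')) return null;
  if (modules.slice(45, 50) !== '01010') return null;
  let digits = '';
  for (let i = 0; i < 12; i++) {
    const start = i < 6 ? 3 + i * 7 : 50 + (i - 6) * 7;
    const lookup = i < 6 ? L_LOOKUP : R_LOOKUP;
    const digit = lookup.get(modules.slice(start, start + 7));
    if (digit === undefined) return null;
    digits += digit;
  }
  return digits;
}

// A backwards scan fails the L-code lookup, so retry with the line reversed.
function decodeScanLine(samples) {
  const modules = toModules(binarize(samples));
  return decodeModules(modules) ?? decodeModules(reverse(modules));
}
```

The fallback in `decodeScanLine` is the orientation trick in action: read backwards, the first chunks have even parity, so they never match an L-code and the decoder knows to flip the line.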
Now that we understand how scanners work, let’s look at how barcode data can flow to a web app.
The Keyboard Wedge
As it turns out, a USB barcode scanner does not use a special protocol. It acts as a standard USB HID keyboard, the same device class as the keyboard on your desk. The operating system loads its generic keyboard driver automatically. The OS handles everything: no WebUSB, no special permissions to grant, no fuss.
Whenever the scanner decodes a barcode, it translates each character into the corresponding USB HID key-press and key-release events.
The character 0 becomes HID scan code 0x27 (key down), then 0x27 (key up).
The character 4 becomes 0x21 down, 0x21 up.
At the end, most scanners send HID scan code 0x28, the Enter key.
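As a rough illustration, the translation from digits to keyboard reports can be sketched like this. The usage IDs are the standard USB HID keyboard ones. The 8-byte layout assumes the boot-protocol keyboard report (modifiers, a reserved byte, then up to six key slots); a real scanner may declare a different report descriptor. The function names are invented for this example. In this layout, a release is signaled by a report that no longer lists the key.

```javascript
const HID_ENTER = 0x28;

// '1'..'9' map to usage IDs 0x1E..0x26; '0' comes last, at 0x27.
function digitToHidUsage(ch) {
  if (!/^\d$/.test(ch)) {
    throw new Error('this sketch only handles decimal digits');
  }
  return ch === '0' ? 0x27 : 0x1e + (Number(ch) - 1);
}

// Boot-protocol keyboard report: [modifiers, reserved, key1..key6].
function keyReport(usage) {
  return Uint8Array.of(0, 0, usage, 0, 0, 0, 0, 0);
}

// One "key down" report and one all-zero "key up" report per character,
// followed by the same pair for Enter.
function barcodeToHidReports(barcode) {
  const usages = [...barcode].map(digitToHidUsage).concat(HID_ENTER);
  return usages.flatMap((usage) => [keyReport(usage), keyReport(0)]);
}
```

For a 13-digit EAN-13 this produces 28 reports: 13 press/release pairs for the digits and one for Enter.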
The name “keyboard wedge” comes from the PS/2 era. Early barcode scanners had two connectors: one plugged into the computer’s keyboard port, and the keyboard plugged into the scanner. The scanner sat physically wedged between the keyboard and the computer. When it decoded a barcode, it injected the characters directly into the keyboard data stream. Modern USB scanners achieve the same effect over HID.
The upside is obvious: zero configuration, universal OS compatibility, works anywhere that accepts keyboard input. Plug it in, focus a field, scan, done.
The downside is just as obvious: the data goes wherever the input focus happens to be.
If the wrong application or the wrong field is focused when the scanner fires, the barcode ends up in the wrong place. There is no way to validate or route the data before it enters the application.
A USB barcode scanner acts as a keyboard. It "types" the decoded string into whatever input field has focus, character by character, then presses Enter. Compare scanner speed (~25ms per key) with human typing (~180ms per key).
Handling Scanner Input in Web Apps
Since a barcode scanner produces standard keydown events, the browser cannot tell the difference between a human typing and a scanner firing. Both generate the same DOM events on the same focused element. The application has to distinguish them.
The most common approach is keystroke timing. A barcode scanner “types” characters with roughly 20–50ms between keystrokes. No human can sustain that speed. The pattern: buffer incoming characters, record timestamps, and when Enter arrives, check the average inter-keystroke interval. If it is below a threshold (say 50ms), treat the buffer as a barcode scan. Otherwise, it was a human.
```javascript
// Detecting barcode scanner input via inter-keystroke timing
let buffer = '';
let firstKeyTime = 0;
const SCAN_THRESHOLD_MS = 50;
const MIN_BARCODE_LENGTH = 6;

document.addEventListener('keydown', (e) => {
  const now = Date.now();

  if (e.key === 'Enter') {
    // From the first character to Enter there are buffer.length intervals.
    const avgInterval = buffer.length > 0
      ? (now - firstKeyTime) / buffer.length
      : Infinity;

    if (buffer.length >= MIN_BARCODE_LENGTH && avgInterval < SCAN_THRESHOLD_MS) {
      handleBarcodeScan(buffer); // application-defined callback
      e.preventDefault();
    }
    buffer = '';
    return;
  }

  // Only printable characters have a one-character key name.
  if (e.key.length === 1) {
    if (buffer.length === 0) firstKeyTime = now;
    buffer += e.key;
  }
});
```

Two other approaches are common:
- Prefix/suffix detection: configure the scanner to send a special character (e.g., Ctrl+B) before the barcode data and Enter after it. The application listens for the prefix and captures everything until the suffix.
- Hidden input field: keep an off-screen <input> element permanently focused; all scanner input goes there, and a listener processes the value.
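The prefix/suffix approach boils down to a two-state machine. Below is a minimal sketch, assuming the scanner is configured to send Ctrl+B as the prefix and Enter as the suffix. `createPrefixSuffixParser` is a hypothetical helper, not part of any library, and it takes plain objects shaped like a KeyboardEvent so it can be exercised without a browser.

```javascript
// Hypothetical helper: calls onScan(code) once per prefixed scan.
function createPrefixSuffixParser(onScan) {
  let capturing = false;
  let buffer = '';

  // Returns true when the key belonged to a scan and should be swallowed.
  return function feed({ key, ctrlKey = false }) {
    if (!capturing) {
      if (ctrlKey && key.toLowerCase() === 'b') {
        capturing = true; // prefix seen: start a fresh capture
        buffer = '';
        return true;
      }
      return false; // ordinary typing, leave it alone
    }
    if (key === 'Enter') {
      capturing = false; // suffix seen: hand over the code
      onScan(buffer);
      return true;
    }
    if (key.length === 1 && !ctrlKey) buffer += key;
    return true;
  };
}

// Browser wiring; skipped when no document is available.
if (typeof document !== 'undefined') {
  const feed = createPrefixSuffixParser((code) => console.log('scanned', code));
  document.addEventListener('keydown', (e) => {
    if (feed(e)) e.preventDefault();
  });
}
```

Unlike the timing heuristic, this does not depend on how fast the scanner types, but it only works if every scanner in the facility is configured with the same prefix.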
For production use, onScan.js handles scanner detection: keystroke speed analysis, prefix/suffix matching, and mobile paste-mode scanners.
Beyond UPC-A and EAN-13
UPC-A and EAN-13 dominate retail, but other 1D barcode symbologies exist for different use cases.
Code 128 can encode the full ASCII character set. It uses three code sets (A, B, C) and can switch between them mid-barcode. Code Set C is particularly clever: it encodes digit pairs (00-99), achieving double-density numeric encoding. This is why shipping labels and logistics barcodes tend to use Code 128.
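To show what double-density means in practice, here is a small sketch of the symbol values for a digits-only Code 128 barcode in Code Set C. It relies on the standard Code 128 values (Start C is 105, Stop is 106) and the modulo 103 checksum; the function name is mine, and the mapping from values to bar patterns is left out.

```javascript
const START_C = 105;
const STOP = 106;

// Hypothetical helper: returns the symbol values, not the bar patterns.
function code128SetCValues(digits) {
  if (!/^(\d\d)+$/.test(digits)) {
    throw new Error('Code Set C needs an even number of digits');
  }
  // Each pair of digits becomes a single symbol with a value from 0 to 99.
  const pairs = digits.match(/\d\d/g).map(Number);
  // Checksum: start value plus each symbol value times its 1-based position.
  const weighted = pairs.reduce(
    (sum, value, index) => sum + value * (index + 1),
    START_C,
  );
  return [START_C, ...pairs, weighted % 103, STOP];
}
```

Six digits take only three data symbols here: `code128SetCValues('123456')` returns `[105, 12, 34, 56, 44, 106]`.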
Code 39 is the workhorse of military and automotive applications. Each character is encoded as 9 elements (5 bars and 4 spaces), with exactly 3 of the 9 being wide. Hence the name “Code 3 of 9.” It is self-checking and requires no mandatory check digit, which made it popular in environments where simplicity mattered more than density.
The GS1 Sunrise 2027 initiative is pushing retailers to support 2D codes (QR, Data Matrix) at the point of sale alongside traditional 1D barcodes.
Conclusion
What felt like magic when I first plugged in that barcode scanner is really a layered stack of encoding, optics, signal processing, and USB emulation. Every link in that chain is simple. UPC-A encoding is a lookup table. The scanning pipeline is threshold-based binarization. The keyboard wedge is USB HID. The elegance is in how they compose: printed bars become a voltage waveform, the waveform becomes bits, the bits become digits, and the digits become keystrokes in whatever field has focus.
No special protocol required.
Further Reading
- Universal Product Code (Wikipedia): full encoding tables and specification details
- First grocery scan, Wrigley’s gum, June 26, 1974 (HISTORY)
- The History of the Bar Code (Smithsonian Magazine)
- onScan.js (GitHub): hardware barcode scanners in web apps
- GS1 barcode types and Sunrise 2027