
Study the pin layout and signal paths before attempting to read the processor schematic. The Intel 4004, the first commercial single-chip CPU, was released in 1971. It contains about 2,300 transistors built with a 10-micrometer pMOS silicon-gate process and comes in a 16-pin dual in-line package. Understanding how those pins connect to memory, clock sources, and I/O expanders helps interpret the wiring plan used in early microcomputer designs.
The chip operates with a 4-bit data path and reaches program memory through a single multiplexed bus that carries addresses, instructions, and data in turn. Program storage typically relied on ROM devices such as the 4001, while RAM devices like the 4002 provided data and status storage. Communication across the bus occurs through time-multiplexed signals timed by two-phase clock inputs labeled Φ1 and Φ2. These timing lines coordinate instruction fetch, decoding, and arithmetic steps inside the processor.
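The multiplexing described above can be illustrated with a short sketch. It assumes a 12-bit program address is sent over the 4-bit bus as three nibbles with the low-order nibble first; the function names `split_address` and `join_nibbles` are hypothetical and exist only for this illustration, not in any 4004 toolchain.

```python
def split_address(addr):
    """Split a 12-bit address into three 4-bit nibbles, low-order first (assumed order)."""
    if not 0 <= addr <= 0xFFF:
        raise ValueError("address must fit in 12 bits")
    return [(addr >> shift) & 0xF for shift in (0, 4, 8)]


def join_nibbles(nibbles):
    """Reassemble an address from nibbles received low-order first."""
    addr = 0
    for position, nibble in enumerate(nibbles):
        addr |= (nibble & 0xF) << (4 * position)
    return addr


nibbles = split_address(0x2A7)   # three bus transfers for one address
restored = join_nibbles(nibbles)
```

A memory chip on the receiving end would latch each nibble in turn, which is why one 4-bit bus is enough to carry a 12-bit address.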
Internal organization includes an arithmetic logic unit, an array of sixteen 4-bit registers, a control section for instruction decoding, and a stack used for subroutine return addresses. The stack stores up to three levels of program counter values. Data transfer between internal blocks occurs through a 4-bit internal bus, timed by the external clock generator chip commonly paired with the processor.
Early calculator systems used this architecture by connecting the CPU to external ROM, RAM, and I/O devices through shared control lines such as SYNC, CM-ROM, and CM-RAM. Reading the wiring scheme with attention to these signals reveals how the processor selects memory banks, moves data between registers, and executes instructions. At the maximum clock rate of about 740 kHz, one instruction cycle of eight clock periods takes roughly 10.8 microseconds.
Internal functional blocks of Intel 4004 shown in the processor circuit structure

Examine the arithmetic logic unit first, because it forms the center of the processor layout. This 4-bit unit performs addition, subtraction through complement operations, increment, rotation through carry, and data transfers. The instruction set has no dedicated logical AND or OR instructions. The ALU operates on the accumulator and values from the 16-register array, and processes one 4-bit word during each instruction phase controlled by a two-phase clock. Carry propagation occurs through a dedicated carry flip-flop, allowing multi-nibble arithmetic across sequential operations.
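The role of the carry flip-flop in multi-nibble arithmetic can be sketched as follows. This is a simplified binary model, not a transcription of the chip's logic; `add_nibble` and `add_multi_nibble` are hypothetical names, and operands are assumed to be lists of nibbles with the least significant nibble first.

```python
def add_nibble(a, b, carry_in):
    """Add two 4-bit values plus a carry bit; return (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + (carry_in & 1)
    return total & 0xF, total >> 4


def add_multi_nibble(x, y):
    """Add two equal-length nibble lists, least significant nibble first."""
    if len(x) != len(y):
        raise ValueError("operands must have the same number of nibbles")
    carry = 0          # models the carry flip-flop, cleared before the first add
    result = []
    for a, b in zip(x, y):
        nibble_sum, carry = add_nibble(a, b, carry)
        result.append(nibble_sum)
    return result, carry


# 0x9F + 0x73 = 0x112: two nibbles of result plus a final carry
digits, carry_out = add_multi_nibble([0xF, 0x9], [0x3, 0x7])
```

Each loop iteration corresponds to one 4-bit addition, with the carry held between steps the way a single flip-flop would hold it between instructions.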
The register array contains sixteen 4-bit registers arranged as pairs. Each pair can form an 8-bit address pointer used during memory access. Register selection takes place through a decoder connected to the instruction control block. This arrangement allows instructions such as register exchange, increment, and load operations without external memory traffic.
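A minimal model of the pairing, assuming pair `p` is made of registers `2p` and `2p + 1` with the even register holding the high nibble. The class `RegisterFile` and its method names are hypothetical and only illustrate the idea.

```python
class RegisterFile:
    """Sixteen 4-bit registers addressable singly or as eight 8-bit pairs."""

    def __init__(self):
        self.regs = [0] * 16

    def write_pair(self, pair, value):
        """Store an 8-bit value: high nibble in the even register (assumed)."""
        if not 0 <= pair <= 7:
            raise ValueError("pair index must be 0..7")
        self.regs[2 * pair] = (value >> 4) & 0xF
        self.regs[2 * pair + 1] = value & 0xF

    def read_pair(self, pair):
        """Combine the two registers of a pair into one 8-bit pointer."""
        if not 0 <= pair <= 7:
            raise ValueError("pair index must be 0..7")
        return (self.regs[2 * pair] << 4) | self.regs[2 * pair + 1]


rf = RegisterFile()
rf.write_pair(3, 0xA5)      # registers 6 and 7 now hold 0xA and 0x5
pointer = rf.read_pair(3)
```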
Instruction decoding and control logic

Instruction decoding uses a control matrix composed of transistor logic gates. The processor recognizes a set of 46 instructions, and the programs built from them are stored in external program memory. Each opcode triggers a sequence of micro-operations controlling register access, ALU functions, and bus direction. Timing signals derived from Φ1 and Φ2 determine which stage of the instruction cycle is active.
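As a rough software analogy for the first decoding step, the sketch below splits an 8-bit instruction word into an upper opcode nibble and a lower operand nibble. The mnemonic table is deliberately partial and covers only a few one-byte register and accumulator instructions; the function name `decode` is hypothetical.

```python
# Partial table: upper nibble -> mnemonic, for a handful of one-byte instructions.
MNEMONICS = {
    0x6: "INC",   # increment register
    0x8: "ADD",   # add register to accumulator
    0x9: "SUB",   # subtract register from accumulator
    0xA: "LD",    # load register into accumulator
    0xB: "XCH",   # exchange register and accumulator
    0xC: "BBL",   # branch back (return) and load data
    0xD: "LDM",   # load immediate data into accumulator
}


def decode(byte):
    """Return (mnemonic, operand nibble); '?' marks opcodes not in this partial table."""
    opcode = (byte >> 4) & 0xF
    operand = byte & 0xF
    return MNEMONICS.get(opcode, "?"), operand


first = decode(0x85)    # ADD with register 5 as operand
second = decode(0xD9)   # LDM with immediate value 9
```

In hardware the same split drives the register decoder and the control matrix in parallel rather than through a lookup table.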
The control section also generates synchronization signals used by external memory chips. The SYNC line marks the start of each instruction cycle, while the CM-ROM and CM-RAM lines select the ROM or RAM devices connected through the shared data path. This approach allows a small package with only sixteen pins to communicate with several external devices.
Program counter and stack mechanism

The program counter spans 12 bits, supporting addressing of up to 4 KB of program storage. It advances after each instruction fetch and can jump to new locations through branch instructions. Internally the counter is divided into three 4-bit sections connected through carry logic.
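The three-section arrangement can be modeled as a ripple increment across nibbles. This is an illustrative sketch, with the hypothetical function `increment_pc` taking the counter as a list `[low, middle, high]`; it does not claim to match the gate-level carry circuit.

```python
def increment_pc(nibbles):
    """Increment a 12-bit counter held as [low, middle, high] 4-bit sections."""
    result = []
    carry = 1                      # the +1 enters at the low section
    for section in nibbles:
        total = (section & 0xF) + carry
        result.append(total & 0xF)
        carry = total >> 4         # carry ripples into the next section
    return result                  # a carry out of the high section is dropped (wraps to 0)


next_pc = increment_pc([0xF, 0xF, 0x0])   # 0x0FF + 1 = 0x100
```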
A three-level stack stores return addresses during subroutine calls. Each level holds a full program counter value. When a call instruction executes, the current address moves into the stack while program flow jumps to the target routine. A return instruction restores the stored address back into the counter.
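The call and return behaviour can be sketched with a small model. The class `ReturnStack` is hypothetical; the choice to discard the oldest entry on a fourth nested call, and to raise an error on a return with nothing stored, are assumptions of this model rather than a description of the hardware.

```python
class ReturnStack:
    """Program counter plus a three-level store of 12-bit return addresses."""

    DEPTH = 3

    def __init__(self):
        self.pc = 0
        self.slots = []

    def call(self, target, return_addr):
        """Save the return address and jump to the target routine."""
        if len(self.slots) == self.DEPTH:
            self.slots.pop(0)              # assumption: the oldest address is lost
        self.slots.append(return_addr & 0xFFF)
        self.pc = target & 0xFFF

    def ret(self):
        """Restore the most recently saved address into the program counter."""
        if not self.slots:
            raise IndexError("return with no saved address")
        self.pc = self.slots.pop()


stack = ReturnStack()
stack.call(0x100, return_addr=0x012)
stack.call(0x200, return_addr=0x105)
stack.ret()                                # back inside the first routine
```

The practical consequence for programmers is that subroutines can nest at most three deep before a return address is lost.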
Data transfer between functional blocks occurs through a 4-bit internal bus. Multiplexing hardware connects the bus alternately to the register array, ALU output, and external data pins. Clock phases coordinate when each block can drive the bus.
The full silicon layout integrates about 2,300 MOS transistors fabricated with a 10-micrometer process. Each functional region (ALU, registers, control logic, and program sequencing hardware) occupies a dedicated area on the die, connected by metal interconnect lines that form the processor wiring scheme.