Outline

Number Systems
Computer Architecture
Java Virtual Machine
Algorithmic Correctness and Complexity

Numbers

Binary conversion

Foundation

To prove

$x^0 = 1$ $x^0 = x^{y-y}$ $x^{y-y} = x^y/x^y$ $x^y/x^y = 1$

Numbers that we generally make use of are written in the decimal number system, which has ten digits (0 … 9).

In decimal the number 957 can be written as

$\begin{aligned} &(9 \times 10^2 ) + (5 \times 10^1 ) + (7 \times 10^0 ) \\ =~&900 ~~~~~~~~~~+ 50 ~~~~~~~~~~~~+ 7 \end{aligned}$

The number “10” (called the base or radix) can, in fact, be any number.

It turns out that using Base 2 (or Binary) is easy for computers.

Base 2 has two digits which can easily be represented on a computer by two states ON and OFF.

Binary (Base 2)

The expanded notation for Binary [Base 2, uses two digits (0,1)]

Consider: 1011

$\begin{aligned} &(1 \times 2^3)+ (0 \times 2^2 ) + (1 \times 2^1 ) + (1 \times 2^0)\\ =~&8~~~~~~~~~~~~+0~~~~~~~~~~~~+2~~~~~~~~~~~~+1\\ =~&11 ~~~~~~~~~~~~~decimal \end{aligned}$

Consider: 1010

$\begin{aligned} &(1 \times 2^3)+ (0 \times 2^2 ) + (1 \times 2^1 ) + (0 \times 2^0)\\ =~&8~~~~~~~~~~~~+0~~~~~~~~~~~~+2~~~~~~~~~~~~+0\\ =~&10 ~~~~~~~~~~~~~decimal \end{aligned}$

Octal (Base 8)

The expanded notation for Octal [Base 8, which uses 8 digits (0,1,2,3,4,5,6,7)

Consider: 1237

$\begin{aligned} &(1 \times 8^3)+ (2 \times 8^2 ) + (3 \times 8^1 ) + (7 \times 8^0)\\ =~&512~~~~~~~~+128~~~~~~~~+24~~~~~~~~~~+7\\ =~&671 ~~~~~~~~~~~~~decimal \end{aligned}$

Consider: 7717

$\begin{aligned} &(7 \times 8^3)+ (7 \times 8^2 ) + (1 \times 8^1 ) + (7 \times 8^0)\\ =~&3584~~~~~~+448~~~~~~~~+8~~~~~~~~~~~~+7\\ =~&4047 ~~~~~~~~~~~~~decimal \end{aligned}$

Hexadecimal (Base 16)

The expanded notation for Hexadecimal [Base 16], which uses 16 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F).

Consider: CBF

$\begin{aligned} &(C \times 16^2)+ (B \times 16^1 ) + (F \times 16^0 ) \\ =~&3072~~~~~~~~~+176~~~~~~~~~~~+15\\ =~&3263 ~~~~~~~~~~~~~decimal \end{aligned}$

Converting

Converting Base 10 to Base 2

Some form of division …

Convert $40_{10}$ to Binary

$\begin{aligned} 2&|&40&|&0&\\ 2&|&20&|&0&\\ 2&|&10&|&0&\\ 2&|&5& |&1&\\ 2&|&2& |&0&\\ 2&|&1& |&1&\\ &&0&|& &\\ \end{aligned}$ $40_{10} = 101000_2$

Converting Base 10 to Base 8 and Base 16

Divide by 8 or 16 instead of 2

Converting Base 2 to Base 8

What is the largest number of “things” you can count using 3 digits in Binary

$\begin{aligned} &111\\ &{[111]}_2={[7]}_{10} \end{aligned}$

Max number of “things” you can count using one digit in Base 8

$\begin{aligned} &7\\ &{[7]}_8={[7]}_{10} \end{aligned}$

How many digits in Base 8 are needed to represent 3 digits of base 2

Consider $100111_2$ to Base 8

$\begin{aligned} &100&111& \\ &4&7& \end{aligned}$

Therefore $100111_2 = 47_8$

Consider $1100111_2$ to Base 8

$\begin{aligned} &001&&100&&111&\\ &1&&4&&7& \end{aligned}$

Therefore $1100111_2 = 147_8$

$\begin{aligned} {[1100111]} _2~=~&2^6+2^5+0+0+2^2+2^1+2^0\\ =~&64+32+0+0+4+2+1\\ =~&{[103]}_{10}\\ \\ {[147]}_8=~&1\times8^2+4\times8^1+7\times8^0\\ =~&1\times64+4\times8+7\\ =~&[103]_{10} \end{aligned}$

Converting Base 8 to Base 2

Consider 333 8 to Base 2

$\begin{aligned} &3&&3&&3&\\ &011&&011&&011& \end{aligned}$

Therefore $333_8 = 11011011_2$

Converting Base 2 to Base 16

4 ‘places’ instead of 3

Positive Real Numbers Converting

Convert $0.1011_2$ to Decimal

Expanded notation of $0.53_{10}$

$\begin{aligned} &(5 \times 10^{-1})+ (3 \times 10^{-2} ) \\ =&.5~~~~~~~~~~~~~~~+.03\\ \end{aligned}$

Similarly using the expanded notation for $0.1011_2$.

$\begin{aligned} &(1 \times 2^{-1})+ (0 \times 2^{-2} ) + (1 \times 2^{-3} ) + (1 \times 2^{-4})\\ =~&1/2~~~~~~~~~~+0/4~~~~~~~~~~+1/8~~~~~~~~~~+1/16\\ =~&11/16\\ =~&0.6875 ~~~~~~~~~~~~~~~decimal \end{aligned}$

Converting 0.625 to Binary

$\begin{aligned} &&(0.625)&|&\times2&\\ 1&&(0.250)&|&\times2&\\ 0&&(0.500)&|&\times2&\\ 1&&(0.000)&|&&\\ \end{aligned}$

So $0.625_{10} = 0.101_2$

Binary Arithmetic

Addition

Remember to carry 1

$\begin{aligned} &&0111\\ &&+0001\\ Carry&&1110\\ &&=1000 \end{aligned}$

Subtraction

Remember to borrow at 1

1000 - 0001

$\begin{aligned} &&1000\\ After~Borrow&&100(10)\\ -&&0001\\ =&&0111 \end{aligned}$

Physical Storage

These numbers are stored in the computer’s memory.
Each number represents one ‘bit’.
Computers used to use 8 bits (called a byte) but now use much larger memory units.
The largest binary number you can represent using a byte is 1111 1111 or $255_{10}$ .

What if we had to add the following numbers on a computer which stores numbers as bytes:
- 0111 1111 + 1
- 1111 1111 + 1
This is called overflow.
Notice how you will always have an overflow regardless of the number of bits available.

More Arithmetic

What if we had to subtract the following numbers:

$0111 1111 - 1111 1111$

We need a way of storing negative numbers.

Simplest way to do this could be to assign the first bit to represent sign, called the sign bit.

Differentiating between overflow and negative numbers?

What is the problem?
Overflow flag

Representing Fractions

Decide where the fraction begins:
- [sign] xxxx.xxx
- [sign] xxx.xxxx
Called ‘fixed point representation’
- Wastes space
- Trade Off between precision and range

Representing Characters

All Characters have an associated numeric value.
This is called character encoding.
One early character encoding is ASCII
- “American Standard Code for Information Interchange”
This did not contain languages other than English.
To overcome this, we now use UTF-8

By now, you should

Know what is binary, octal, hexadecimal
Be able to convert numbers from base[2,8,10,16] ⟷ base[2,8,10,16]
Be able to add and subtract in binary
Know how letters are stored on a computer
Know how numbers are stored on a computer (physically).

Storing Negative Numbers

The use of a ‘sign bit’ to store negative numbers has problems:

Let’s use 4 bits (a nibble) to represent numbers with only a sign bit:

$\begin{aligned} &&{[+2]}_{10}→[0]{[010]}\\ &&{[-2]}_{10}→[1]{[010]}\\ &&Sum→[1]{[100]}\\ Or&&[-4]_{10}&& Wrong \end{aligned}$

The use of a ‘sign bit’ to store negative numbers has problems:
Once again, let’s use 4 bits to represent numbers with only a sign bit. What are these numbers:
- [0] [000]
- [1] [000]
Having two representations of the same number (0) should be avoided.

2’s complement

In addition to a sign bit, an n-bit negative number is also stored in 2’s complement
- The 2’s complement of a n-bit number is its complement wrt $2^n$
- which is: $2^n - (n~bit~number)$
- which is: n-bit number with bits “flipped” +1
The 2’s complement of a n-bit number is its complement wrt $2^n$
- 0001’s 2’s complement is (when using a 4-bit word):
  - $2^4 - 0001 = 10000 - 0001 = 1111$
  - Or $0001 → 1110 + 1 = 1111$
Another example:
- 0101’s 4-bit 2’s complement is:
  - $2^4 - 0101 = 10000 - 0101 = 1011$
  - Or $0101 → 1010 + 1 = 1011$
Another example 4-bit 2’s complement:
- 3 + (-3)
- 0 011 +
  
  1 101
- 1 0 000
0 000 is zero, what about 1 000?
Another example:
- $-2 + -1$
- $+2 → [0 010]; -2 → [1 101 + 1] → [1 110]$
- $+1 → [0 001]; -1 → [1 110 + 1] → [1 111]$
- $\begin{aligned} carry&&11100\\ &&1110&+\\ &&1111&\\ &&11101& \end{aligned}$
- $-3 → [ 1 100 + 1] → 1 101$
Another example (8-bit word):
- -2: 1 1111110
- -6: 1 1111010
- -2 + -6 :
- $\begin{aligned} &&1 &&1111110&+\\ &&1 &&1111010&\\ carry&&1 1 &&1111100&\\ &&1 1 &&1111000& \end{aligned}$
- -8: 1 1111000
It is important to differentiate between an overflow and a carry.
- This can be done using the XOR of the two numbers.

Fixed Point Representation

Numerical precision
- The precision of a numeric value describes the number of digits that are used to show that value.
- Only values that are an integer multiple of the smallest power of two can be represented exactly.
Numerical range
- Increase in precision is at the expense of range (given the same word).

Floating point representation

A method that prevents the architecture of the system from deciding on the tradeoff between precision and range
- This is left to the programmer

Standard Form (Scientific Notation) - Decimal

- (significand or mantissa) $\times$ 10 $^ {(exponent)}$

NOTE: Do not confuse this mantissa with the mantissa in logarithms.

$100=1\times10^2$
$123=1.23\times10^2$
$123.4=1.234\times10^2$
$0.002=2\times10^{-3}$

Standard Form - Binary

$100=1\times2^2$ ←MUST be in Binary
$100=1\times2^{10}$
$101.01=1.0101\times2^{10}$
$0.001=1\times2^{-11}$

Floating point - Binary

Consider an 8-bit word:
- [sign bit] [sign of exponent]
  - [exponent] →3 bits
  - [(fractional) mantissa] →3 bits
The official IEEE definition is for 32 and 64 bits。
The exponent is an offset - We do NOT consider this.
- 01 011 000
- 0 1 011 100
- 0 0 010 010
sign sign [Zeros at Start] [Zeros at End]
WRONG!!! Zero
- 0 0 000 000
- 1 0 000 000
Zero → Correct
- 0 1 000 000
- 1 1 000 000
There are two representations for zero

Floating point - Binary (Addition)

0 1 011 000 + 0 1 010 100 = 0 1 001 000
- 0 1 011 000
- 0 1 010 100
- 0 1 001 000

Overflow vs Carry in 2’s complement

Carry:

$\begin{aligned} &1~111~~+ \\ &1~100 \\ (1)&1~011 \end{aligned}$

Overflow:

$\begin{aligned} (-7)~~1001& \\ +~(-6)~~1010& \\ (-13)~~0011&~=~3 \end{aligned}$

Computer Architecture

From Circuits to Binary

Summing two binary numbers:

0 + 0 = (0) 0

1 + 0 = (0) 1

0 + 1 = (0) 1

1 + 1 = (1) 0

From Circuits to Binary

Summing two binary numbers:
- 0 + 0 = (0) 0 (series) parallel
- 1 + 0 = (0) 1 (and) or
- 0 + 1 = (0) 1
- 1 + 1 = (1) 1
1 + 1 = (1) 1
- Using only series and parallel circuits would not work as shown by this last case. This last case also requires something called “NOT”. What we have explored, however, is an illustration of how circuits can be used for binary arithmetic.

Non-numbers

Non-numbers are represented as numbers before storing.
A common mapping is called the ASCII table.
The extended ASCII table and UTF-8 includes non-english characters.

Little man computer

The Little Man Computer (LMC) is an instructional model of a computer, created by Dr. Stuart Madnick in 1965. The LMC is generally used to teach students, because it models a simple von Neumann architecture computer—which has all of the basic features of a modern computer. It can be programmed in machine code (albeit in decimal rather than binary) or assembly code.

The LMC model is based on the concept of a little man shut in a closed mail room (analogous to a computer in this scenario). At one end of the room, there are 100 mailboxes (memory), numbered 0 to 99, that can each contain a 3 digit instruction or data (ranging from 000 to 999). Furthermore, there are two mailboxes at the other end labeled INBOX and OUTBOX which are used for receiving and outputting data. In the center of the room, there is a work area containing a simple two function (addition and subtraction) calculator known as the Accumulator and a resettable counter known as the Program Counter. The Program Counter holds the address of the next instruction the Little Man will carry out. This Program Counter is normally incremented by 1 after each instruction is executed, allowing the Little Man to work through a program sequentially. Branch instructions allow iteration (loops) and conditional programming structures to be incorporated into a program. The latter is achieved by setting the Program Counter to a non-sequential memory address if a particular condition is met (typically the value stored in the accumulator being zero or positive).

As specified by the von Neumann architecture, each mailbox (signifying a unique memory location) contains both instructions and data. Care therefore needs to be taken to stop the Program Counter from reaching a memory address containing data - or the Little Man will attempt to treat it as an instruction. One can take advantage of this by writing instructions into mailboxes that are meant to be interpreted as code, to create self-modifying code. To use the LMC, the user loads data into the mailboxes and then signals the Little Man to begin execution, starting with the instruction stored at memory address zero. Resetting the Program Counter to zero effectively restarts the program, albeit in a potentially different state.

Execution cycle

To execute a program, the little man performs these steps:

Check the Program Counter for the mailbox number that contains a program instruction (i.e. zero at the start of the program)
Fetch the instruction from the mailbox with that number. Each instruction contains two fields: An opcode (indicating the operation to perform) and the address field (indicating where to find the data to perform the operation on).
Increment the Program Counter (so that it contains the mailbox number of the next instruction)
Decode the instruction. If the instruction utilises data stored in another mailbox then use the address field to find the mailbox number for the data it will work on, e.g. ‘get data from mailbox 42’)
Fetch the data (from the input, accumulator, or mailbox with the address determined in step 4)
Execute the instruction based on the opcode given
Branch or store the result (in the output, accumulator, or mailbox with the address determined in step 4)
Return to the Program Counter to repeat the cycle or halt

Commands

While the LMC does reflect the actual workings of binary processors, the simplicity of decimal numbers was chosen to minimize the complexity for students who may not be comfortable working in binary / hexadecimal

Instructions

Some LMC simulators are programmed directly using 3-digit numeric instructions and some use 3-letter mnemonic codes and labels. In either case, the instruction set is deliberately very limited (typically about ten instructions) to simplify understanding. If the LMC uses mnemonic codes and labels then these are converted into 3-digit numeric instructions when the program is assembled.

The table below shows a typical numeric instruction set and the equivalent mnemonic codes.

Instruction	Mnemonic	MachineCode
Load	LDA	5xx
Store	STA	3xx
Add	ADD	1xx
Subtract	SUB	2xx
Input	INP	901
Output	OUT	902
End	HLT	000
Branch if zero	BRZ	7xx
Branch if zero or positive	BRP	8xx
Branch always	BRA	6xx
Data storage	DAT	-

Numeric code	Mnemonic code	Instruction	Description
1xx	ADD	ADD	Add the value stored in mailbox xx to whatever value is currently on the accumulator (calculator).Note: the contents of the mailbox are not changed, and the actions of the accumulator (calculator) are not defined for add instructions that cause sums larger than 3 digits. Similarly to SUBTRACT, one could set the negative flag on overflow.
2xx	SUB	SUBTRACT	Subtract the value stored in mailbox xx from whatever value is currently on the accumulator (calculator).Note: the contents of the mailbox are not changed, and the actions of the accumulator are not defined for subtract instructions that cause negative results - however, a negative flag will be set so that 7xx (BRZ) and 8xx (BRP) can be used properly.
3xx	STA	STORE	Store the contents of the accumulator in mailbox xx (destructive).Note: the contents of the accumulator (calculator) are not changed (non-destructive), but contents of mailbox are replaced regardless of what was in there (destructive)
5xx	LDA	LOAD	Load the value from mailbox xx (non-destructive) and enter it in the accumulator (destructive).
6xx	BRA	BRANCH(unconditional)	Set the program counter to the given address (value xx). That is, value xx will be the next instruction executed.
7xx	BRZ	BRANCH IF ZERO (conditional)	If the accumulator (calculator) contains the value 000, set the program counter to the value xx. Otherwise, do nothing. Whether the negative flag is taken into account is undefined. When a SUBTRACT underflows the accumulator, this flag is set, after which the accumulator is undefined, potentially zero, causing behavior of BRZ to be undefined on underflow. Suggested behavior would be to branch if accumulator is zero and negative flag is not set.Note: since the program is stored in memory, data and program instructions all have the same address/location format.
8xx	BRP	BRANCH IF POSITIVE (conditional)	If the accumulator (calculator) is 0 or positive, set the program counter to the value xx. Otherwise, do nothing. As LMC memory cells can only hold values between 0 and 999, this instruction depends solely on the negative flag set by an underflow on SUBTRACT and potentially on an overflow on ADD (undefined).Note: since the program is stored in memory, data and program instructions all have the same address/location format.
901	INP	INPUT	Go to the INBOX, fetch the value from the user, and put it in the accumulator (calculator)Note: this will overwrite whatever value was in the accumulator (destructive)
902	OUT	OUTPUT	Copy the value from the accumulator (calculator) to the OUTBOX.Note: the contents of the accumulator are not changed (non-destructive).
000	HLT/COB	HALT/COFFEE BREAK	Stop working/end the program.
	DAT	DATA	This is an assembler instruction which simply loads the value into the next available mailbox. DAT can also be used in conjunction with labels to declare variables. For example, DAT 984 will store the value 984 into a mailbox at the address of the DAT instruction.

Examples

Using Numeric Instruction Codes

This program (instruction 901 to instruction 000) is written just using numeric codes. The program takes two numbers as input and outputs the difference. Notice that execution starts at Mailbox 00 and finishes at Mailbox 07. The disadvantages of programming the LMC using numeric instruction codes are discussed below.

Mailbox	Numeric code	Operation	Comments
00	901	INBOX —> ACCUMULATOR	INPUT the first number, enter into calculator (erasing whatever was there)
01	308	ACCUMULATOR —> MEMORY[08]	STORE the calculator’s current value (to prepare for the next step…)
02	901	INBOX —> ACCUMULATOR	INPUT the second number, enter into calculator (erasing whatever was there)
03	309	ACCUMULATOR —> MEMORY[09]	STORE the calculator’s current value (again, to prepare for the next step…)
04	508	MEMORY[08] —> ACCUMULATOR	(Now that both INPUT values are STORED in Mailboxes 08 and 09…)LOAD the first value back into the calculator (erasing whatever was there)
05	209	ACCUMULATOR = ACCUMULATOR - MEMORY[09]	SUBTRACT the second number from the calculator’s current value (which was just set to the first number)
06	902	ACCUMULATOR —> OUTBOX	OUTPUT the calculator’s result to the OUTBOX
07	000	(no operation performed)	HALT the LMC

Using Mnemonics and Labels

Assembly language is a low-level programming language that uses mnemonics and labels instead of numeric instruction codes. Although the LMC only uses a limited set of mnemonics, the convenience of using a mnemonic for each instruction is made apparent from the assembly language of the same program shown below - the programmer is no longer required to memorize a set of anonymous numeric codes and can now program with a set of more memorable mnemonic codes. If the mnemonic is an instruction that involves a memory address (either a branch instruction or loading/saving data) then a label is used to name the memory address.

INP
STA FIRST
INP
STA SECOND
LDA FIRST
SUB SECOND
OUT
HLT
FIRST DAT
SECOND DAT

Labels

Without labels the programmer is required to manually calculate mailbox (memory) addresses. In the numeric code example, if a new instruction was to be inserted before the final HLT instruction then that HLT instruction would move from address 07 to address 08 (address labelling starts at address location 00). Suppose the user entered 600 as the first input. The instruction 308 would mean that this value would be stored at address location 08 and overwrite the 000 (HLT) instruction. Since 600 means “branch to mailbox address 00” the program, instead of halting, would get stuck in an endless loop.

To work around this difficulty, most assembly languages (including the LMC) combine the mnemonics with labels. A label is simply a word that is used to either name a memory address where an instruction or data is stored, or to refer to that address in an instruction.

When a program is assembled:

A label to the left of an instruction mnemonic is converted to the memory address the instruction or data is stored at. i.e. loopstart INP
A label to the right of an instruction mnemonic takes on the value of the memory address referred to above. i.e. BRA loopstart
A label combined with a DAT statement works as a variable, it labels the memory address that the data is stored at. i.e. one DAT 1 or number1 DAT

In the assembly language which uses mnemonics and labels, if a new instruction was inserted before the final HLT instruction then the address location labelled FIRST would now be at memory location 09 rather than 08 and the STA FIRST instruction would be converted to 309 (STA 09) rather than 308 (STA 08) when the program was assembled.

Labels are therefore used to:

identify a particular instruction as a target for a BRANCH instruction.
identify a memory location as a named variable (using DAT) and optionally load data into the program at assembly time for use by the program (this use is not obvious until one considers that there is no way of adding 1 to a counter. One could ask the user to input 1 at the beginning, but it would be better to have this loaded at the time of assembly using one DAT 1)

Example

This program will take a user input, and count down to zero.

     INP
     OUT      // Initialize output 
LOOP BRZ QUIT // If the accumulator value is 0, jump to the memory address labeled QUIT
     SUB ONE  // Label this memory address as LOOP, The instruction will then subtract the value stored at address ONE from the accumulator
     OUT
     BRA LOOP // Jump (unconditionally) to the memory address labeled LOOP
QUIT HLT      // Label this memory address as QUIT
ONE  DAT 1    // Store the value 1 in this memory address, and label it ONE (variable declaration)

This program will take a user input, square it, output the answer and then repeat. Entering a zero will end the program.
(Note: an input that results in an output greater than 999 will cause an error due to the LMC 3 digit number limit).

START   LDA ZERO     // Initialize for multiple program run
        STA RESULT
        STA COUNT
        INP          // User provided input
        BRZ END      // Branch to program END if input = 0
        STA VALUE    // Store input as VALUE
LOOP    LDA RESULT   // Load the RESULT
        ADD VALUE    // Add VALUE, the user provided input, to RESULT
        STA RESULT   // Store the new RESULT
        LDA COUNT    // Load the COUNT
        ADD ONE      // Add ONE to the COUNT
        STA COUNT    // Store the new COUNT
        SUB VALUE    // Subtract the user provided input VALUE from COUNT
        BRZ ENDLOOP  // If zero (VALUE has been added to RESULT 
        			 // by VALUE times), branch to ENDLOOP
        BRA LOOP     // Branch to LOOP to continue adding VALUE to RESULT
ENDLOOP LDA RESULT   // Load RESULT
        OUT          // Output RESULT
        BRA START    // Branch to the START to initialize 
        			 // and get another input VALUE
END     HLT          // HALT - a zero was entered so done!
RESULT  DAT          // Computed result (defaults to 0)
COUNT   DAT          // Counter (defaults to 0)
ONE     DAT 1        // Constant, value of 1
VALUE   DAT          // User provided input, the value to be squared (defaults to 0)
ZERO    DAT          // Constant, value of 0 (defaults to 0)

Note: If there is no data after a DAT statement then the default value 0 is stored in the memory address.

In the example above, [BRZ ENDLOOP] depends on undefined behaviour, as COUNT-VALUE can be negative, after which the ACCUMULATOR value is undefined, resulting in BRZ either branching or not (ACCUMULATOR may be zero, or wrapped around). To make the code compatible with the specification, replace:

...
LDA COUNT    // Load the COUNT
ADD ONE      // Add ONE to the COUNT
STA COUNT    // Store the new COUNT
SUB VALUE    // Subtract the user provided input VALUE from COUNT
BRZ ENDLOOP  // If zero (VALUE has been added to RESULT 
			 // by VALUE times), branch to ENDLOOP
...

with the following version, which does VALUE-COUNT instead of COUNT-VALUE, making sure the accumulator never underflows:

...
LDA COUNT    // Load the COUNT
ADD ONE      // Add ONE to the COUNT
STA COUNT    // Store the new COUNT
LDA VALUE    // Load the VALUE
SUB COUNT    // Subtract COUNT from the user provided 
			 // input VALUE
BRZ ENDLOOP  // If zero (VALUE has been added to RESULT
			 // by VALUE times), branch to ENDLOOP
...

Another example is a quine, printing its own machine code (printing source is impossible because letters cannot be outputted):

LOAD LDA 0     // Load position 0 into the accumulator. 
			   // This line will be modified on each loop 
			   // to load the next lines into the accumulator
     OUT       // Output the accumulator's value. The accumulator's 
      		   // value will be the line that was just loaded
     SUB ONE   // Subtract 1 from the value in the accumulator. 
      		   // This is so we can do the BRZ in the next step 
      		   // to see if we are on the last line in the program
     BRZ ONE   // If the previous subtraction has made the 
      		   // accumulator 0 (which means we had the value 001 
      		   // in the accumulator), then branch to position ONE
     LDA LOAD  // Load the LOAD position into the accumulator, 
      		   // this is in preparation to increment the address
               // digits for this position
     ADD ONE   // Increment the position digits for the LOAD line. 
      		   // The value currently in the accumulator would, 
      		   // if read as an instruction, load the next line 
      		   // into the accumulator, compared to the last 
      		   // line loaded
     STA LOAD  // Store the newly incremented LOAD line back 
      		   // in the LOAD position
     BRA LOAD  // Return to the beginning of the loop
ONE  DAT 1     // The variable 1. If read as an instruction, 
 			   // this will be interpreted as HLT/COB and will 
 			   // end the program

This quine works using self-modifying code. Position 0 is incremented by on each loop, outputting that line’s code, until the code it is outputting is 1, at which point it branches to the ONE position. The ONE position begins with a 0, so it is interpreted as a HALT/COB instruction.

Von Neumann architecture

Instruction	Mnemonic	MachineCode
Load	LDA	5xx
Store	STA	3xx
Add	ADD	1xx
Subtract	SUB	2xx
Input	INP	901
Output	OUT	902
End	HLT	000
Branch if zero	BRZ	7xx
Branch if zero or positive	BRP	8xx
Branch always	BRA	6xx
Data storage	DAT	-

Von Neumann architecture - CPU

The Central Processing Unit (CPU)
Consists of
- Control Unit
- Arithmetic and Logical Unit (ALU)
- Registers
  - Program counter
  - Instruction Register
  - Address Register
  - Accumulator
Bus
- Control bus
- Data bus
- Address bus
RAM
Input/Output
Clock Cycle
- A signal that oscillates between high and low used to synchronise (coordinate) actions.
- The number of cycles that a CPU uses per second is used to determine its speed.
  - This is measured in megahertz (MHz) and gigahertz (GHz)
Instruction Cycle
- Is the execution process which consists of:
  - Fetch - fetching the relevant instruction.
  - Decode - decode the instruction to determine what to do.
  - Execute - The actual execution of the statement.
Also called the fetch-decode-execute cycle or the fetch-execute cycle.

Fetch

Inspect the program counter to find the address of the next instruction
Load the next instruction from memory into the instruction register
Update the program counter to point at the next instruction

Decode

Determine the type of instruction fetched
If the instruction requires data from memory, determine its address

Execute

Fetch any required data from memory into one of the CPU registers
Execute the instruction
Fetch the next instruction …

Abstraction

Variables
Branching to Loops
Functions of Subroutines

The Little Man Computer

The Little Man Computer - Add

// Output the sum of two numbers 
INP 
STA 99 
INP 
ADD 99 
OUT 
HLT

The Little Man Computer - Add (Variable)

// Output the sum of two numbers
INP 
STA FIRST
INP 
ADD FIRST 
OUT 
HLT 
FIRST DAT

LMC - Branching

        INP
        STA FIRST
        INP
        STA SECOND
        INP
        STA THIRD
        BRZ ADDNUMS
        LDA FIRST
        SUB SECOND
        OUT
        HLT
ADDNUMS	LDA FIRST
        ADD SECOND
        OUT
        HLT
        FIRST DAT
        SECOND DAT
        THIRD DAT

00 INP
01 STA FIRST
02 INP
03 STA SECOND
04 INP
05 STA THIRD
06 BRZ 11
07 LDA FIRST
08 SUB SECOND
09 OUT
10 HLT
11 LDA FIRST
12 ADD SECOND
13 OUT
14 HLT
15 FIRST DAT
16 SECOND DAT
17 THIRD DAT

LMC - Branching [Cleaner code 1]

        INP
        STA FIRSTINP
        INP
        STA SECONDINP
        INP
        STA WHAT_TO_DO
        BRZ ADDNUMS
        LDA FIRSTINP
        SUB SECONDINP
        OUT
        BRA END
ADDNUMS	LDA FIRSTINP
        ADD SECONDINP
        OUT
END 	HLT
        FIRSTINP DAT
        SECONDINP DAT
        WHAT_TO_DO DAT

LMC - Branching [Cleaner code 2]

            INP
            STA FIRSTINP
            INP
            STA SECONDINP
            INP
            STA WHAT_TO_DO
            BRZ ADD_NUMS
            BRA SUB_NUMS
            BRA END
ADD_NUMS 	LDA FIRSTINP
            ADD SECONDINP
            OUT
            BRA END
SUB_NUMS 	LDA FIRSTINP
            SUB SECONDINP
            OUT
            BRA END
END 		HLT
            FIRSTINP DAT
            SECONDINP DAT
            WHAT_TO_DO DAT

LMC - Loops

     INP
     STA A
LOOP OUT
     ADD A
     BRA LOOP
     A DAT

    INP
    STA A
L 	OUT
    ADD A
    STA B
    SUB C
    BRP E
    LDA B
    BRA L
E 	HLT
A 	DAT
B 	DAT
C 	DAT 100

LMC Practice Problems

Write a program to divide the first input number by the second input number.

NOTE: Division by repeated subtraction
Write a program to multiply two input numbers (by repeated addition)
Write a program to generate the first 10 Fibonacci numbers:

Babbage’s Analytical Engine

The Analytical engine was a proposed mechanical general-purpose computer
designed by English mathematician and computer pioneer Charles Babbage.

High-Level Programming Languages

The compiler

A program that converts programs written in “high-level” languages to machine code.

Linking

A program might make use of libraries - code prewritten to perform specific tasks, or OS specific libraries - often contained in several different files.
- A Linker, links “symbols” (variables and function names) across such files.

Parsing

Parsing is the process of going through a program and ensuring that the syntax of the program is correct.
- Both compilers and interpreters perform this step.
In addition to checking for correctness the parse also extracts information from the program (such as variables, constants, … )

Interpreter

An interpreter is a program that performs what a program written in a high-level language expresses without first converting it into machine code.
- Some interpreters convert programs into an intermediary representation before “executing” the program.

Loader

Loads programs and data to memory for execution.
Often a sub-task of the OS

The Role of the Operating System

Loading and Scheduling
“Virtual Memory”

The problem with security

A need for “Better” programs online
Downloadable programs
- Security
The “Jail”
Platform independence

JVM

The Java Virtual Machine
- What is a virtual machine
- Why do we need a virtual machine
Security through a virtual machine
Platform independence

Java Virtual Machine

Foundation

Lists or Arrays

A sequential “list” of variables
- Used to be restricted by the type of variable
Removes the need to remember independent variables
Variables of a single “type” can be clubbed together

Elements in an Array

If a is an array
a[0] is an element
0 is the index
0 can be replaced by any number upto length - 1
0 can be replaced by a variable.

Memory Leaks

Data structure

Stacks

Abstraction
Like a stack of paper
First In Last Out (FILO)
Often implemented through lists
Add element by “Pushing”
Remove element by “Popping”

Queues

Abstraction
Like standing in a Queue
First in First out (FIFO)
Has the same push and pop functions available
Implemented through a list

Functions

What is a function?
Functions as bricks
Why functions?
“Calling a function”
Passing values to a function
Return Values
Sub-programs
Methods

Calling Functions

Storing the values of passed “parameters”
Recursive functions - functions that call themselves
The call stack
Returning values
Returning to where the function was called from

Passing parameters by Value and Reference

Editable vs non-editable parameters
A copy vs a reference to the memory location
Which is which

The Call Stack

Recursion

Recursion (adjective: recursive) occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics and computer science, where a function being defined is applied within its own definition. While this apparently defines an infinite number of instances (function values), it is often done in such a way that no loop or infinite chain of references can occur.

General Notes on Programming

Writing clear and understandable code is more important that using “smart tricks”
Always indent code.
Always use variable names that make sense.
- Same with method/function names
If you are asked to write a program in this course - you WILL lose points for not
indenting, not using comments, not using good variable/method names

Mathematical Expressions - BODMAS

$1+3\times2$ $1-7+8$

Infix:

$1+(3 \times 2)$ $1-7+8$

Know as “infix expression”

Reverse Polish notation

Reverse Polish notation (RPN), also known as Polish postfix notation or simply postfix notation, is a mathematical notation in which operators follow their operands.

$((15 \div (7 − (1 + 1))) \times 3) − (2 + (1 + 1))$ $15~~7~~1~~1~~+~~− ~~\div ~~3 ~~\times ~~2~~1~~1~~+~~+~~−$

Examples

Example 1:
- $5 + 3 \times 6 \times ( 8 \times 2 + 1 )$
- $5~~3~~6~~\times~~8~~2~~\times~~1~~+~~\times~~+$
Example 2:
- $1 + 2 \times 6 \div (8 + 32) \times 4$
- $1~~2~~6~~\times~~8~~3~~2~~+~~\div~~4~~\times~~+$

Shunting-yard algorithm

Converts infix to postfix (Reverse Polish Notation)

Developed by Edsger Dijkstra who formalised computer science in the early 1970s.

While there are tokens to be read:
	Read a token
 	If it's a number add it to queue
	If it's an operator
		While there's an operator on the top of the stack with greater precedence:
			Pop operators from the stack onto the output queue
		Push the current operator onto the stack
	If it's a left bracket push it onto the stack
	If it's a right bracket
		While there's not a left bracket at the top of the stack:
			Pop operators from the stack onto the output queue.
		Pop the left bracket from the stack and discard it
While there are operators on the stack, pop them to the queue

while (there are tokens to be read) {
    Read a token;

    if (it is a number) {
        add it to queue;
    }

    if (it is an operator) {
        while (there is an operator on the top 
                    of the stack with greater precedence) {
            Pop operators from the stack onto the output queue;
        }

        Push the current operator onto the stack;
    }

    if (it is a left bracket) {
        push it onto the stack;
    }

    if (it is a right bracket) {
        while (there is not a left bracket at the top of the stack) {
            Pop operators from the stack onto the output queue;
        }

        Pop the left bracket from the stack and discard it;
    }
}

while (there are operators on the stack) 
    pop them to the queue;