Tutorials:Value types
Revision as of 16:15, 18 April 2018
In memory there really are no types; every value is stored as bytes. It is how the process uses a value that dictates its type. The encoding does differ a lot between types: as an integer, 1 is stored as 0x01, but in an ASCII string the character "1" is stored as 0x31 (values written in 0x* notation are in hexadecimal format).
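As a small illustration of the integer versus ASCII example above, here is a minimal Python sketch. Python is used only for demonstration, and the variable names are made up for this example.

```python
# The number 1 stored as a one-byte integer.
as_integer = (1).to_bytes(1, "little")

# The character "1" stored as ASCII text.
as_ascii = "1".encode("ascii")

print(as_integer.hex())  # 01
print(as_ascii.hex())    # 31
```

Both results are a single byte; only the interpretation differs.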
Bits, Bytes, and Words
A bit is a binary digit, so a bit is either a zero or a one. Bits are implemented in computer hardware using switches. If the switch is closed (on) then the bit is one, and if the switch is open (off) then the bit is zero. A bit is limited to representing two values, since it is a base-2 digit.
Since the English alphabet contains more than two letters, a letter cannot be represented by a single bit. A byte is a sequence of bits. Since the mid-1960s a byte has been 8 bits in length. 01000001 is an example of a byte. Since there are 8 bits in a byte there are 2^8 different possible sequences for one byte, ranging from 00000000 to 11111111. This means that a byte can be used to represent any type of value with no more than 2^8 = 256 possible values. Since the number of things that you can enter on a computer keyboard is smaller than 256 (including all keystroke pairs, like shift or control plus another key), each keystroke can be represented by a code that fits within a byte.[1]
Note: Unicode was introduced to handle multiple languages, and is built on ASCII: its first 128 code points are the ASCII characters.
Side note: ASCII was based on telegraph code, and started out as a 7-bit system.
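The byte arithmetic above can be checked with a short Python sketch (illustrative only; the variable names are assumptions of this example). It counts the patterns in 8 bits and decodes the example byte 01000001.

```python
# Eight bits give 2 ** 8 distinct patterns.
patterns = 2 ** 8

# The example byte 01000001, read as a base-2 number.
value = int("01000001", 2)

# The same value looked up as an ASCII character.
letter = chr(value)

print(patterns, value, letter)  # 256 65 A
```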
You will often hear that all values are stored in hexadecimal format, but really the data is always stored as binary bits (at least until quantum computers are standard). Tools simply convert the stored binary to hexadecimal when displaying it.
Note: Hexadecimal is just a base 16 number system, decimal is base 10, and binary is base 2.
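To show that binary, decimal, and hexadecimal are only different ways of writing the same stored value, here is a minimal Python sketch (the names are made up for this example):

```python
value = 0b01000001  # written here as a binary literal

as_binary = format(value, "08b")  # base 2, padded to 8 digits
as_decimal = str(value)           # base 10
as_hex = format(value, "02X")     # base 16, padded to 2 digits

print(as_binary, as_decimal, as_hex)  # 01000001 65 41

# Parsing the text in each base gives back the same number.
same = int(as_binary, 2) == int(as_decimal, 10) == int(as_hex, 16)
print(same)  # True
```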
So bytes are the basic data units, and any ASCII character can be stored in one byte. You will very often come across size names like WORD and DWORD. "In computing, a word is the natural unit of data used by a particular processor design"[1], and that is the definition assembly originally used as well. When computers used 8-bit processors a WORD was 1 byte, and when they were 16-bit a WORD was 2 bytes. However, computers started becoming really popular around the time of 32-bit processors, and for maximum compatibility assemblers stopped using that definition and stuck with a WORD being 2 bytes. So even though the natural unit is 4 bytes for a 32-bit processor and 8 bytes for a 64-bit processor, a WORD is always 2 bytes and a DWORD (double word) is always 4 bytes.
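The fixed sizes can be seen by packing values with Python's standard `struct` module. This is only a sketch: the mapping of the format codes `H`, `I`, and `Q` to WORD, DWORD, and QWORD is this example's own labelling, and the `<` prefix (little-endian, standard sizes) is an assumption chosen so the sizes do not depend on the machine running the code.

```python
import struct

# "<" selects standard sizes, so the results are the same on any machine.
word = struct.pack("<H", 1)   # unsigned 16-bit value (WORD)
dword = struct.pack("<I", 1)  # unsigned 32-bit value (DWORD)
qword = struct.pack("<Q", 1)  # unsigned 64-bit value (QWORD)

print(len(word), len(dword), len(qword))  # 2 4 8
```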
Signs
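In memory there is no direct way to represent a negative number, so the sign has to be encoded in the bits themselves. There are several ways of doing this; the most common one, two's complement, is used because it simplifies the arithmetic.
An unsigned byte (not allowing negative numbers) can hold 0 to 255, while a signed byte (allowing negative numbers, stored in two's complement) can hold -128 to 127.[2]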
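As a sketch of how one bit pattern can be read as either signed or unsigned, the following Python example uses `int.from_bytes`, which reads signed values as two's complement. The variable names are made up for this example.

```python
raw = bytes([0xFF])  # the bit pattern 11111111

unsigned = int.from_bytes(raw, "little", signed=False)
signed = int.from_bytes(raw, "little", signed=True)

print(unsigned)  # 255
print(signed)    # -1

# The limits of a signed byte.
lowest = int.from_bytes(bytes([0x80]), "little", signed=True)
highest = int.from_bytes(bytes([0x7F]), "little", signed=True)

print(lowest, highest)  # -128 127
```

The stored byte is identical in both cases; only the reading of the top bit changes.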
Floating points
In memory there are no fractions or decimal points, so a floating-point system was set up. It stores an approximation, trading off range against precision. There are other ways of representing fractional numbers, such as fixed point, binary-coded decimal, or logarithmic number systems, but floating point is by far the most common way of representing an approximation of real numbers in a computer's memory. It comes in both single and double precision.[3]
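The trade-off between size and precision can be sketched with Python's standard `struct` module, storing 0.1 as a single and as a double. The `<` prefix and the variable names are assumptions of this example.

```python
import struct

# Store 0.1 as a single (4 bytes) and as a double (8 bytes).
single_bytes = struct.pack("<f", 0.1)
double_bytes = struct.pack("<d", 0.1)

# Read both back as Python floats (which are doubles).
single = struct.unpack("<f", single_bytes)[0]
double = struct.unpack("<d", double_bytes)[0]

print(len(single_bytes), single)
print(len(double_bytes), double)
```

The value read back from the single differs slightly from 0.1, because fewer bits are available for the approximation, while the double round-trips to the same Python float.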
Value sizes
So these are the standard value sizes:
- Bit
- A binary digit, the smallest unit of data in a computer's memory.
- Byte
- A group of bits (usually eight), operated on as a unit.
- WORD - Word
- 2 Bytes (16 bits).
- DWORD - Double word
- 4 Bytes (32 bits).
- QWORD - Quadword
- 8 Bytes (64 bits).
These are some of the larger sizes:
- TWORD - Ten byte
- 10 Bytes (80 bits).
- OWORD - Octoword
- 16 Bytes (128 bits).
- YWORD
- 32 Bytes (256 bits).
- ZWORD
- 64 Bytes (512 bits).
Value types
So these are the standard value types:
- Bit
- Unsigned integer.
- Byte
- (Un)signed integer.
- 2 Bytes - WORD
- (Un)signed integer.
- 4 Bytes - DWORD
- (Un)signed integer.
- 8 Bytes - QWORD
- (Un)signed integer.
- Float - DWORD
- Single precision floating point.
- Double - QWORD
- Double precision floating point.
- Text / String
- A string of text characters (any length).
- Can be ASCII or Unicode (wide) encoded.
- Array of bytes / AOB
- An array of bytes (any length).
- Represented as a string of bytes.
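To tie the value types back to the idea that memory itself has no types, here is a minimal Python sketch that reads the same four bytes as several of the types listed above. The byte values, the little-endian `<` prefix, and the variable names are all assumptions chosen for this example.

```python
import struct

# An array of bytes (AOB), as it might appear in memory.
raw = bytes([0x00, 0x00, 0x80, 0x3F])

as_dword = struct.unpack("<I", raw)[0]  # 4-byte unsigned integer
as_float = struct.unpack("<f", raw)[0]  # single precision float
as_words = struct.unpack("<HH", raw)    # two 2-byte unsigned integers

print(raw.hex())  # 0000803f
print(as_dword)   # 1065353216
print(as_float)   # 1.0
print(as_words)   # (0, 16256)

# The same text stored as ASCII and as wide (UTF-16) characters.
ascii_text = "Hi".encode("ascii")
wide_text = "Hi".encode("utf-16-le")

print(ascii_text.hex())  # 4869
print(wide_text.hex())   # 48006900
```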
Sources
1. www.cs.scranton.edu/~cil102/data_bits.html
2. wikipedia.org/wiki/Signed_number_representations
3. wikipedia.org/wiki/Floating-point_arithmetic