CAD minor edits and clarifications

mikera · mikera · commit 458c31f763d8 · 2024-08-27T18:07:02.000+01:00
diff --git a/cad/002_values/README.md b/cad/002_values/README.md
@@ -1,6 +1,15 @@
 # CVM Values
 
-Convex depends on a consistent representation of information values that are used within the CVM and Convergent Proof Of Stake Consensus. This document describes the values used and key design requirements that specify the available data types in the CVM and Convex Peers / Clients.
+Convex uses a special representation of information values within the CVM, across the data lattice and to support the Convergent Proof Of Stake Consensus. 
+
+Values in Convex are special for a number of reasons:
+- They are **pure immutable values** well suited for use in **functional programming**
+- They are designed for **efficient encoding** and network transmission
+- They form **Merkle trees** supporting cryptographic verification
+- They implement **orthogonal persistence**: they automatically migrate between main memory and disk as required
+- They support **structural sharing**, making operations such as taking snapshots of the entire CVM state possible in O(1) time
+
+It is fair to say that Convex wouldn't be possible without this powerful and flexible implementation of data values. This document describes the values used and key design requirements that specify the available data types in the CVM and Convex Peers / Clients.
 
 ## Motivation
 
@@ -68,23 +77,31 @@ This also ensures that Peers can safely store multiple versions of large data st
 
 ### Canonical Encoding
 
-All CVM values MUST have a unique canonical **Encoding** as a fixed length sequence of bytes. See [Encoding CAD](/cad/003_encoding/README.md) for full specification.
+All CVM values MUST have a unique canonical **encoding** as a fixed length sequence of bytes. See [Encoding CAD](/cad/003_encoding/README.md) for full specification.
 
 CVM values are **defined to be equal** if and only if their Encoding is identical.
 
 ### Value ID
 
-Each unique CVM value is defined to have a **Value ID** that is equal to the SHA3-256 hash of the value's Encoding.
+Each unique CVM value is defined to have a **Value ID** that is equal to the SHA3-256 hash of the value's encoding.
 
-The Value ID is important, since it makes it possible to refer to Values using a relatively small fixed-length reference type. 
+The Value ID is extremely important, because:
+- It makes it possible to refer to values using a small fixed-length reference suitable for content-addressable storage
+- It makes it possible to cryptographically verify that content is correct in an efficient way (since it acts as the Merkle root of the value when seen as a Merkle tree).
 
 ## Types
 
 ### Primitive Types
 
+#### Integers
+
+An integer is a whole number (positive or negative) as commonly defined in arithmetic.
+
+Convex allows big integers up to the size of 32768 bits, i.e. around `1.4*10^9864`. This may be extended in the future, though we haven't found a sensible use case that is likely to require integers this large.
+
 #### Long
 
-A Long is a 64-bit, signed integer.
+A Long is a 64-bit, signed integer. Longs are the subset of integers within this 64-bit range. For efficiency reasons, Convex automatically uses longs in place of big integers where possible: from a developer perspective, there is usually no need to distinguish between the two.
 
 Examples:
 
@@ -98,6 +115,12 @@ Longs are the natural representation of small integer values within a fixed rang
 
 Longs are also used to represent quantities of native Convex Coins (which by the definition of the 10^18 max supply cap, are guaranteed to fit in 64 bits and not overflow when value quantities are added or subtracted).
 
+#### Byte
+
+A Byte is an 8-bit, unsigned integer. From a developer perspective, bytes can generally be considered simply as longs in the range 0-255.
+
+Bytes are useful for representing small integer values efficiently, such as a small set of flags or short codes. They are also important as the individual elements of Blob data (equivalent to immutable byte arrays). They are encoded as just 1-2 bytes of data, and are therefore recommended for very memory-conscious applications.
+
 #### Double
 
 A Double is a 64-bit double precision floating point value as defined in the IEEE 754 standard.
@@ -111,17 +134,13 @@ Examples:
 ##Inf
 ```
 
-Doubles are suitable for many applications that need to represent numerical values that can be very large or very small, but do not need to maintain absolute precision beyond a certain number of decimal places. 
+Doubles are suitable for many applications that need to represent numerical values that can be very large or very small, but do not need to maintain precision beyond a certain number of decimal places. 
 
-While the lowest bits of precision may be lost, Double computations are still deterministic.
-
-Doubles support some special values as per the IEEE 754 standard: Positive infinity, negative infinity, negative zero and NaN (not a number).
-
-#### Byte
+The maximum IEEE 754 double value is around `1.7976931348623157*10^308`. 
 
-A Byte is an 8-bit, unsigned integer.
+While the lowest bits of precision may be lost, double computations are still deterministic.
 
-Bytes are suitable for representing small integer values efficiently, such as a small set of flags or short codes. They are also important as the individual elements of Blob data (equivalent to immutable byte arrays)
+Doubles also support some special values as per the IEEE 754 standard: Positive infinity, negative infinity, negative zero and NaN (not a number).
 
 #### Character
 
@@ -131,7 +150,7 @@ A Character can map to 1-4 bytes in UTF-8 encoding. For maximum efficiency, char
 
 #### Boolean
 
-A Boolean type contains only two values `true` and `false`.
+A Boolean value is one of the two values `true` and `false`.
 
 In addition to their utility in general purpose programming, `true` and `false` are particularly efficient in the CVM, requiring only 1 byte of Encoding.
 
@@ -153,11 +172,11 @@ Examples:
 #666
 ```
 
-Addresses are logically equivalent 63-bit positive integers, though they are not intended for use in calculation. Note that Longs could have been used for this purpose, however a specialised Address value type has some additional advantages:
+Addresses can be considered equivalent to 63-bit positive integers, though they are not intended for use in calculation. Note that Longs could have been used for this purpose, however a specialised Address value type has some additional advantages:
 
 - A separate notation for Addresses makes them more clearly visible in code.
 - We can apply additional security validation and prevent some user errors (e.g. getting argument orders wrong and passing an asset quantity instead of an address which might produce unexpected results...)
-- The implementation can be made slightly more optimised
+- The implementation can be made more optimised
 
 #### Blob
 
@@ -171,11 +190,11 @@ Examples
 0x                                                                    ;; The empty Blob (0 bytes)
 ```
 
-Blobs are especially useful for storing opaque units of data that may be important to external systems (e.g. client data encodings) as well as cryptographic values such as keys, hashes or verification proofs. While is is possible to manipulate Blobs in CVM code, this is not usually recommended: such handling should normally be done off-chain.
+Blobs are especially useful for storing opaque units of data that may be important to external systems (e.g. client data encodings) as well as cryptographic values such as keys, hashes (including value IDs) or verification proofs. While it is possible to manipulate Blobs in CVM code, this is not usually recommended: such handling should normally be done off-chain.
 
 #### String
 
-A String is a sequence of bytes intended to represent the UTF-8 character encoding of text.
+A String is a sequence of bytes intended to represent the UTF-8 character encoding of text. 
 
 Examples:
 
@@ -184,6 +203,8 @@ Examples:
 ""                ;; The empty string
 ```
 
+Internally, storage and management of Strings is very similar to Blobs.
+
 #### Symbol
 
 A Symbol is a identifier used to name things: values stored in an Account environment, or meaningful symbolic values in code.
@@ -196,7 +217,7 @@ count
 hello
 ```
 
-Symbols are 1-128 byes long, expressed in UTF-8 encoding. 
+Symbols are 1-128 bytes long, expressed in UTF-8 encoding. 
 
 Symbols have special behaviour when evaluated: they perform a lookup of the value named by the symbol in the current environment. If this behaviour is not desired, they should be **quoted** with `'` to specify that the actual symbol is required, not the referenced value. An example of this usage:
 
@@ -211,7 +232,7 @@ a
 => a                               ;; No lookup is performed for quoted symbol
 ```
 
-Internally Symbols *may* contain arbitrary characters (including badly formed UTF-8), but some of these may not read correctly in an off-chain Parser - therefore it is up to users to ensure that the Symbols they define are readable if this is a requirement.
+Internally, Symbols *may* contain arbitrary characters (including badly formed UTF-8), but some of these may not read correctly in an off-chain Parser - therefore it is up to users to ensure that the Symbols they define are readable if this is a requirement.
 
 #### Keyword
 
@@ -303,8 +324,6 @@ Indexes can be created using the core function `index`
 (index)
 ```
 
-
-
 #### Set
 
 A Set is a data structure that contains zero or more values as **members** of the set. 
@@ -327,13 +346,15 @@ Records behave like Maps when accessed using their field names as keys mapped to
 
 #### Block
 
+A block is a group of transactions submitted by a peer to the network. 
+
+Unlike blockchains, Convex does not require blocks to be chained to the previous block via a hash - which allows them to be created and submitted in parallel. They are best thought of as groups of contiguous transactions submitted by the same peer in the ordering.
+
 #### Account
 
-An Account record represents information regarding the current state of an Account. This includes:
+An Account record represents information regarding the current state of an Account. 
 
-| Key                    | Type    | Description |
-| ---                    | ----    | ----        |
-| :sequence              | Long    | The current sequence number. Next transaction must have this value plus one |
+See CAD004 for more details of the specification and contents of Accounts.
 
 
 #### Peer
@@ -359,28 +380,13 @@ The State represents a total global State of the CVM. This includes
 - Global settings and status flags
 - The Schedule
 
-### Transaction Types
-
-Transaction types represent instructions to Convex that can be submitted by external Clients.
-
-#### Invoke
-
-An `Invoke` transaction is a request to execute some CVM code by a User Account. This is the most general type of transaction: any CVM code may be executed.
-
-#### Call
-
-A `Call` is a transaction requesting the execution of a callable function (typically a smart contract entry point) from a user Account.
-
-Semantically, this is roughly equivalent to using an `Invoke` transaction to do the following:
-
-`(call target-address (function-name arg1 arg2 .... argN)`
+#### Transaction Types
 
-`Call` transaction types are mainly intended as an efficient way for user applications to invoke smart contract calls on behalf of the User.
+Transaction types represent instructions to Convex that can be submitted by external Clients. Transactions are specialised record types.
 
+For more details see CAD010.
 
-#### Transfer
 
-A `Transfer` is a transaction requesting the transfer of Convex Coins from a User Account to some other Account. 
 
 ## Implementation notes
 
diff --git a/cad/010_transactions/README.md b/cad/010_transactions/README.md
@@ -15,14 +15,13 @@ The general lifecycle of a transaction is as follows:
 
 1. Client constructs a transaction containing the desired instruction to the network
 2. Client signs the transaction using a private Ed25519 key
-3. The signed transaction is submitted to a Peer
-4. The Peer incorporates the transaction into a Belief, which is propagated to the network
+3. The signed transaction is submitted to a peer of the client's choosing
+4. The peer incorporates the transaction into a Belief, which is propagated to the network
 5. The transaction is confirmed in consensus according to the CPoS algorithm
-6. The Peer computes the effect of the transaction on the CVM state, and any result(s)
-7. Peer returns a confirmed transaction result to the Client
+6. The peer computes the effect of the transaction on the CVM state, and any result(s)
+7. The peer returns a confirmed transaction result to the client
  
 
-
 ## Transaction Types
 
 All signed transactions MUST contain at least the following fields:
@@ -32,28 +31,30 @@ All signed transactions MUST contain at least the following fields:
 
 ### Transfer
 
-A Transfer Transaction causes a transfer of Convex Coins from the origin account to a destination account
+A `Transfer` is a transaction requesting the transfer of Convex Coins from a user (origin) account to some other (target) account. 
 
-A Transfer Transaction MUST specify an amount to transfer, as an integer.
+A transfer transaction MUST specify an amount to transfer, as an integer.
 
-The Source Account MUST be the Origin Account for the Transaction, i.e. transfers can only occur from the Account which has the correct Digital Signature
+The source account MUST be the origin account for the transaction, i.e. transfers can only occur from the account which has the correct digital signature.
 
-Both Accounts MUST be valid, otherwise the Transaction MUST fail
+Both accounts MUST be valid, otherwise the transaction MUST fail.
 
-The Transaction MUST fail if any of the following are true:
+The transaction MUST fail if any of the following are true:
 - The source Account has insufficient balance to pay for Transfer Transaction fees.
 - The transferred Amount is negative
 - The transferred Amount is greater than the Convex Coin Balance of the source Account (after subtracting any Transfer Transaction Fees)
 
-If the Transfer Transaction does not fail for any reason, then:
+If the transfer transaction does not fail for any reason, then:
 - The Amount MUST be subtracted from the Source Account's Balance
 - The Amount MUST be added to the Destination Account's balance
 
-A transfer amount of zero will succeed, though this is relatively pointless. Users SHOULD avoid submitting such transfers, unless they are willing to pay transaction fees simply to have this recorded in consensus.
+A transfer amount of zero will succeed, though this is relatively pointless. Users SHOULD avoid submitting such transfers, unless there is a good reason (e.g. publicly proving the ability to transact with a given account).
 
 ### Invoke
 
-An Invoke Transaction causes the execution of CVM Code
+An `Invoke` transaction is a request to execute some CVM code by a user account. This is the most general type of transaction: any CVM code may be executed.
+
+An Invoke transaction causes the execution of CVM code when successfully signed and submitted to the Convex network.
 
 An Invoke Transaction MUST include a payload of CVM Code. This may be either:
 - A pre-compiled CVM Op
@@ -70,6 +71,14 @@ Otherwise, the CVM State MUST be updated by the result of executing the CVM Code
 
 ### Call
 
+A `Call` is a transaction requesting the execution of a callable function (typically a smart contract entry point) from a user account.
+
+Semantically, this is broadly equivalent to using an `Invoke` transaction to do the following:
+
+`(call target-address (function-name arg1 arg2 .... argN))`
+
+`Call` transaction types are mainly intended as an efficient way for user applications to invoke smart contract calls on behalf of the User.
+
 A Call Transaction causes the invocation of an Actor function.
 
 Apart from lower transaction fees, the Call instruction MUST be functionally equivalent to invoking CVM Code of the form:
@@ -107,6 +116,17 @@ Transaction results MUST be returned in a `Result` record which contains the fol
 
 An an optimisation, peers MAY avoid creating `Result` records if they have no requirement to report results back to clients.
 
+### Verification
+
+If the client trusts the peer, the returned result may be taken as evidence that the transaction has succeeded. 
+
+If there are doubts about the integrity of the peer, further verification may be performed in several ways:
+- Checking the consensus ordering to ensure that the transaction occurred when the peer claimed
+- Querying the CVM state to ensure transaction effects have been carried out
+- Confirming the result with one or more independent peers  
+
+It is generally the responsibility of the user / app developer to choose an appropriate level of verification and ensure connection to trusted peers.
+
 ## Peer Responsibilities
 
 Peers are generally expected to be responsible for validating and submitting legitimate transactions for consensus on behalf of their clients.
@@ -117,7 +137,7 @@ Peers SHOULD submit legitimate transaction for consensus, unless they have a rea
 
 Peers SHOULD submit transactions in the order that they are received from any single client. Failure to do so is likely to result in sequence errors and potential economic cost for the peers.
 
-Peers SHOULD validate the digital signature of transactions they include in a block. Failure to do so is likely to result in penalities (at a minimum, paying the fees for the invalid transaction) 
+Peers SHOULD validate the digital signature of transactions they include in a block. Failure to do so is likely to result in penalties (at a minimum, paying the fees for the invalid transaction) 
 
 Peers MAY reject transactions that do not appear to be legitimate, in which case the Peer SHOULD return a Result to the Client submitting the transaction indicating the reason for rejection. Some examples where this may be appropriate:
 - Any transaction that has an obviously invalid sequence number (less than that required for the current Consensus State)