Computer Science Asked by A. Sallai on December 20, 2020
I’m writing my BSc thesis about the type systems of various languages and I want to have a short section about assembly languages. Initially I thought I would bring up assembly as a counterexample to languages with advanced type systems. My goal was to explore the reasons why assembly omits most (all?) type system features, but eventually I found papers about typed assembly languages and I started to get confused.
So here are my questions:
Here is a list of papers about typed assembly languages from the end of the last century: http://www.cs.cornell.edu/talc/papers.html.
Links to other papers are highly appreciated!
JVM byte code can be considered an assembly language, and it is most definitely typed. Classes, interfaces, exceptions are actual parts of that assembly language. Any system that executes JVM byte code will actually include a byte code verifier to check that all the instructions are properly typed.
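To make the idea of a verifier concrete, here is a minimal sketch of a check over straight-line stack code. It is only loosely inspired by bytecode verification: the instruction names (`push_int`, `push_ref`, `add_int`, `array_length`) and the two-type universe are assumptions made for illustration, not real JVM opcodes or the real verification rules, and branches are not handled.

```python
# Toy verifier for a straight-line stack machine (illustrative only).
# Each hypothetical instruction maps to (types it pops, types it pushes).
SIGNATURES = {
    "push_int": ((), ("int",)),
    "push_ref": ((), ("ref",)),
    "add_int": (("int", "int"), ("int",)),
    "array_length": (("ref",), ("int",)),
}


def verify(program):
    """Return the final abstract stack, or raise TypeError if an
    instruction would see operands of the wrong type."""
    stack = []
    for index, op in enumerate(program):
        pops, pushes = SIGNATURES[op]
        # Operands are popped in reverse order of their declaration.
        for expected in reversed(pops):
            if not stack:
                raise TypeError(f"{index}: {op} pops from an empty stack")
            actual = stack.pop()
            if actual != expected:
                raise TypeError(f"{index}: {op} expected {expected}, found {actual}")
        stack.extend(pushes)
    return stack
```

With these assumed rules, `verify(["push_int", "push_int", "add_int"])` succeeds, while `verify(["push_int", "push_ref", "add_int"])` is rejected before anything is executed, which is the point of checking types at load time.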
Answered by gnasher729 on December 20, 2020
Machine language is untyped. It's simply numbers which are interpreted as actions by the main and ancillary processors.
The most basic kinds of assembly language follow this. But there is no reason that types can't be used. For example, you could declare a certain set of data as a string. Then the assembler can emit an error when you attempt to do arithmetic on this data.
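As a rough sketch of what such an assembler check could look like: the `decl` directive, the tiny instruction set and the function name below are all hypothetical, invented for this example rather than taken from any real assembler.

```python
# Hypothetical mini-assembler pass: declarations attach a type to a symbol,
# and arithmetic mnemonics are rejected on symbols not declared as "int".
ARITHMETIC = {"add", "sub", "mul"}


def check_arithmetic(lines):
    """Return a list of error messages; an empty list means the check passed."""
    symbols = {}  # symbol name -> declared type
    errors = []
    for number, line in enumerate(lines, start=1):
        parts = line.replace(",", " ").split()
        if not parts:
            continue
        if parts[0] == "decl":  # assumed syntax: decl <name> <type>
            symbols[parts[1]] = parts[2]
        elif parts[0] in ARITHMETIC:
            for operand in parts[1:]:
                declared = symbols.get(operand)
                # Undeclared operands (registers, literals) are left unchecked.
                if declared is not None and declared != "int":
                    errors.append(
                        f"line {number}: {parts[0]} applied to {operand} of type {declared}"
                    )
    return errors
```

For instance, `check_arithmetic(["decl msg string", "add rax, msg"])` reports an error on line 2, whereas the same `add` on a symbol declared `int` passes.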
Undoubtedly, types were invented in the higher-level languages, and have migrated down the language stack as their utility began to be recognised.
Answered by Mozibur Ullah on December 20, 2020
If you mean "assembly" as in, e.g., x86 assembly language, then I think yes, to some degree. Types are constraints that we can statically check or prove, and there is very little (but not nothing) that we can check statically in a given x86 assembly program, e.g.
add rcx, [@addr]
jmp rcx
Here it is possible to infer that [@addr] is a 64-bit memory value starting at @addr, but it is very hard to statically check whether the following jmp rcx is safe, because we would have to know whether rcx holds a valid address that contains executable code.
Another example is
mov rax, @pointer0
mov rcx, @pointer1
add rax, rcx
Suppose that @pointer0 and @pointer1 are valid pointers. Then this program should not type-check, because adding two pointers seems meaningless. But x86 assembly allows it.
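The two examples above can be sketched as a toy checker that tracks a type per register. Everything here is an assumption made for illustration: the type names ("int", "ptr", "code", "unknown"), the tuple encoding of instructions, the `add_mem` pseudo-instruction for an add with a memory operand, and the rules themselves are not part of x86 or of any real tool.

```python
# Toy register-type checker (illustrative rules, not x86 semantics).
def add_types(a, b):
    """Result type of adding two values under the assumed rules."""
    if a == "int" and b == "int":
        return "int"
    if {a, b} == {"ptr", "int"}:
        return "ptr"  # pointer plus offset is still a pointer
    return "unknown"


def check_registers(program, symbol_types):
    """Return a list of error messages for a list of instruction tuples."""
    regs = {}  # register name -> inferred type
    errors = []
    for index, (op, *args) in enumerate(program):
        if op == "mov":  # mov reg, @symbol
            reg, symbol = args
            regs[reg] = symbol_types[symbol]
        elif op == "add":  # add dst, src (both registers)
            dst, src = args
            a, b = regs.get(dst, "unknown"), regs.get(src, "unknown")
            if a == "ptr" and b == "ptr":
                errors.append(f"{index}: adding two pointers")
            regs[dst] = add_types(a, b)
        elif op == "add_mem":  # add dst, [@symbol]: loaded value has unknown type
            dst, _symbol = args
            regs[dst] = add_types(regs.get(dst, "unknown"), "unknown")
        elif op == "jmp":  # jmp reg
            (reg,) = args
            if regs.get(reg, "unknown") != "code":
                errors.append(f"{index}: cannot prove {reg} holds a code address")
    return errors
```

Under these rules the first example is flagged because nothing is known about rcx at the jump, and the second because both operands of the add are pointers. Adding an integer offset to a pointer is accepted.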
If we add an "advanced" type system into x86 assembly language, then... it wouldn't be this language anymore (the type system isn't independent from the language).
Typed assembly language (TAL) refers to intermediate languages (not to adding a type system to the existing x86 assembly language). Its goal is (verbatim from the paper):
...to provide a fully automatic way to verify that programs will not violate the primitive abstractions of the language.
That means: given some program (in a high-level language) which has been type-checked to satisfy some properties, when the program is compiled into another program (in TAL), we can still check (in TAL) that these properties are satisfied.
Answered by Ta Thanh Dinh on December 20, 2020
Many assembly languages do have certain features that could be considered static typing. Most often this is for making programming easier, rather than type checking.
In many assembly languages you can define the equivalent of C's structs and unions. Many assembly languages also allow the usage of arrays, where the type (in the sense of byte-count) of the elements is determined at assemble time.
n.b. "x86 assembly" is a very vague term; every assembler has its own dialect. I'll refer to MASM, as it is the most featureful assembler.
It says a lot that MASM has the TYPEDEF keyword. It also has the ASSUME keyword, which marks the value of a register as a pointer to a specific type and so affects later uses of that register. You can also use LENGTHOF, SIZEOF, and TYPE on arrays; these return statically available information.
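To illustrate what "statically available" means here, the following sketch computes the three values from a simple data declaration without running any program. The function name and the parsing are my own simplification (comma-separated initializers only; no DUP, strings or nested structures), not how MASM is implemented.

```python
# Element sizes in bytes for a few MASM data types.
ELEMENT_SIZES = {"BYTE": 1, "WORD": 2, "DWORD": 4, "QWORD": 8}


def describe_array(declaration):
    """Compute TYPE, LENGTHOF and SIZEOF for a declaration such as
    'table DWORD 1, 2, 3' (simplified syntax, see the assumptions above)."""
    name, element_type, initializers = declaration.split(maxsplit=2)
    element_size = ELEMENT_SIZES[element_type]   # TYPE: bytes per element
    length = len(initializers.split(","))        # LENGTHOF: number of elements
    return {
        "name": name,
        "TYPE": element_size,
        "LENGTHOF": length,
        "SIZEOF": element_size * length,         # SIZEOF: total bytes
    }
```

For `table DWORD 1, 2, 3` this gives TYPE 4, LENGTHOF 3 and SIZEOF 12, all derived from the declaration alone.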
In MASM, there is a distinction between near and far addresses, and variables can be declared with certain types. Structs can be defined with a particular alignment which influences how the fields are accessed. Structs, unions and arrays can be nested. Bitfield types also exist.
ARM assembly is intended to be a compiler target rather than convenient for humans (as far as assemblies go) so it doesn't have any of these features.
Regarding machine code itself: Theoretically it is possible for a computer architecture to use extra bits to determine the type of that word. Some old computer architectures did use an extra bit to distinguish data from pointers.
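A minimal sketch of the tag-bit idea, assuming a made-up layout of a 32-bit payload plus one extra tag bit (this does not model any specific historical machine):

```python
# Hypothetical tagged word: 32 payload bits plus one tag bit above them.
WORD_BITS = 32
PAYLOAD_MASK = (1 << WORD_BITS) - 1
TAG_POINTER = 1 << WORD_BITS  # tag bit set means "this word is a pointer"


def make_data(value):
    return value & PAYLOAD_MASK


def make_pointer(address):
    return TAG_POINTER | (address & PAYLOAD_MASK)


def is_pointer(word):
    return bool(word & TAG_POINTER)


def load(memory, word):
    """Dereference a word, refusing to treat plain data as an address."""
    if not is_pointer(word):
        raise TypeError("attempt to dereference a data word")
    return memory[word & PAYLOAD_MASK]
```

Note that this check happens when the instruction executes, so it corresponds to dynamic rather than static typing: `load(memory, make_pointer(1))` works, while `load(memory, make_data(1))` is trapped at run time.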
Answered by Artelius on December 20, 2020
Assembly language is normally untyped, in the sense that there is no type-checking. Adding type-checking is a non-trivial research challenge (hence the papers you see). Papers on typed assembly language should explain the motivation. One application is that they can be used to support proof-carrying code, which can be used to securely execute untrusted code. Another potential application is to support formal verification. But I suggest reading some of the key papers to see what they say about the applications and motivation for typed assembly language.
You can find other papers yourself by doing a literature search -- see https://crypto.stackexchange.com/q/8316/351.
Answered by D.W. on December 20, 2020