Structs in C
... and a bit on how to implement singly linked lists with them.
Here is the link to VCF East 2019 -- Brian Kernighan interviews Ken Thompson, followed by a few words on the pioneers of C and Unix:
- Ken Thompson: Designed and implemented Unix.
- Brian Kernighan: Contributed to the development of Unix and C. Co-author of the book The C Programming Language, which had a big impact on making C popular.
- Dennis Ritchie: Created the C programming language.
Exercise: ABC Compiler Project
In the header lexer.h change the declarations
extern int token_kind;
extern size_t token_line;
extern size_t token_col;
to
struct Token
{
    enum TokenKind kind;
    struct TokenPos
    {
        size_t line, col;
    } pos;
};

extern struct Token token;
or, without a tag for the nested struct (this is equivalent for our purposes)
struct Token
{
    enum TokenKind kind;
    struct
    {
        size_t line, col;
    } pos;
};

extern struct Token token;
Modify your implementation of the lexer so that after a call of getToken() the member token.kind of the global variable token holds the correct token kind.
Change the test program xtest_lexer.c for the lexer to
#include <stdio.h>

#include "lexer.h"

int
main(void)
{
    while (getToken() != EOI) {
        printf("%zu.%zu: %s\n", token.pos.line, token.pos.col,
               strTokenKind(token.kind));
    }
}
Adapt your implementation in lexer.c accordingly. Commit these changes to your git repository.
Grammar
In Session 9 you saw the grammar for declarations in C. This is the part for structure type specifiers:
\[
\begin{array}{rcl}
\langle\text{structure-type-specifier}\rangle & \to & \textbf{struct}\; \langle\text{identifier}\rangle\; \textbf{\{}\; \langle\text{struct-declaration-list}\rangle\; \textbf{\}}\; \\
 & \to & \textbf{struct}\; \textbf{\{}\; \langle\text{struct-declaration-list}\rangle\; \textbf{\}}\; \\
 & \to & \textbf{struct}\; \langle\text{identifier}\rangle\; \\
\langle\text{struct-declaration-list}\rangle & \to & \langle\text{struct-declaration}\rangle\; \\
 & \to & \langle\text{struct-declaration-list}\rangle\; \langle\text{struct-declaration}\rangle\; \\
\langle\text{struct-declaration}\rangle & \to & \langle\text{specifier-qualifier-list}\rangle\; \langle\text{struct-declarator-list}\rangle\; \textbf{;}\; \\
\langle\text{specifier-qualifier-list}\rangle & \to & \langle\text{type-specifier}\rangle\; \\
 & \to & \langle\text{type-specifier}\rangle\; \langle\text{specifier-qualifier-list}\rangle\; \\
 & \to & \langle\text{type-qualifier}\rangle \\
 & \to & \langle\text{type-qualifier}\rangle\; \langle\text{specifier-qualifier-list}\rangle \\
\langle\text{struct-declarator-list}\rangle & \to & \langle\text{struct-declarator}\rangle \\
 & \to & \langle\text{struct-declarator-list}\rangle\; \textbf{,}\; \langle\text{struct-declarator}\rangle \\
\langle\text{struct-declarator}\rangle & \to & \langle\text{declarator}\rangle \\
 & \to & \langle\text{declarator}\rangle\; \textbf{:}\; \langle\text{constant-expression}\rangle \\
 & \to & \textbf{:}\; \langle\text{constant-expression}\rangle \\
\end{array}
\]
Use Cases Not Shown in the Video
Some of these features were used in the video but not explained in detail, some were only mentioned, and some you might see here for the first time.
Initialized Definition
When you define a struct variable you can initialize all of its members or just the first few of them:
struct Foo
{
    int a, b;
};

struct Foo foo1 = {23, 42}; // initialize: foo1.a = 23, foo1.b = 42
struct Foo foo2 = {23};     // initialize: foo2.a = 23
With a positional initializer list like this it is not possible to skip the first few members and initialize just the rest (since C99, designated initializers such as {.b = 42} can be used for that).
If you do not initialize all members, the remaining members are zero initialized. This holds for global and local variables alike. Only a local struct variable that is defined without any initializer at all has indeterminate values (just like uninitialized local variables in general).
For nested structs you can use nested initializer lists:
struct Foo
{
    int a, b;
    struct Bar
    {
        long c;
        char d;
    } bar;
    int e;
};

struct Foo foo = { 1, 2, { 3, 4, }, 5, };
/*
    foo is initialized with:
    foo.a = 1, foo.b = 2, foo.bar.c = 3, foo.bar.d = 4, foo.e = 5
*/
In initializer lists the trailing comma is optional.
Assignments
Like “normal” variables, structured variables can be used in assignments. In the assignment all members are copied:
struct Dummy
{
    int a, b;
};

int
main(void)
{
    struct Dummy foo, bar = {1, 2};

    foo = bar; // assign all members: foo.a = bar.a, foo.b = bar.b
}
Function Parameters
Structs can be parameters of a function. The function receives a copy of the struct, i.e. we have “call by value” as always in C. To imitate a call by reference you have to pass a pointer:
#include <stdio.h>

struct Dummy
{
    int a, b;
};

void
foo(struct Dummy dummy)
{
    dummy.a = 12;
    dummy.b = 34;
}

void
bar(struct Dummy *dummy)
{
    (*dummy).a = 12;
    (*dummy).b = 34;
}

int
main(void)
{
    struct Dummy d = { 1, 2 };

    foo(d);
    printf("After 'foo(d)': d.a = %d, d.b = %d\n", d.a, d.b);
    bar(&d);
    printf("After 'bar(&d)': d.a = %d, d.b = %d\n", d.a, d.b);
}
Here is the generated output of the executable:
theon$ gcc struct_param.c
theon$ ./a.out
After 'foo(d)': d.a = 1, d.b = 2
After 'bar(&d)': d.a = 12, d.b = 34
theon$
Recall the precedence of operators in C, in particular that postfix operators have a higher precedence than prefix operators:
- The dereference operator ('*') is a prefix operator.
- The member operator ('.') is a postfix operator.
Hence, the expression *dummy.a would be equivalent to *(dummy.a). This makes no sense here: dummy is a pointer and has no members, so the compiler rightly complains.
Indirect Member Access
The (postfix) operator for indirect member access ('->') is equivalent to first dereferencing a pointer and then accessing a member. Hence, dummy->a is equivalent to (*dummy).a. This makes the code easier to read:
#include <stdio.h>

struct Dummy
{
    int a, b;
};

void
foo(struct Dummy dummy)
{
    dummy.a = 12;
    dummy.b = 34;
}

void
bar(struct Dummy *dummy)
{
    dummy->a = 12; // indirect member access
    dummy->b = 34; // indirect member access
}

int
main(void)
{
    struct Dummy d = { 1, 2 };

    foo(d);
    printf("After 'foo(d)': d.a = %d, d.b = %d\n", d.a, d.b);
    bar(&d);
    printf("After 'bar(&d)': d.a = %d, d.b = %d\n", d.a, d.b);
}
Here is the generated output of the executable:
theon$ gcc struct_param2.c
theon$ ./a.out
After 'foo(d)': d.a = 1, d.b = 2
After 'bar(&d)': d.a = 12, d.b = 34
theon$
Return Value
Functions can also return a struct. The caller receives a copy of the returned structured variable:
#include <stdio.h>

struct Dummy
{
    int a, b;
};

struct Dummy
foo(void)
{
    struct Dummy d = { 12, 34 };
    return d;
}

int
main(void)
{
    struct Dummy d = foo();

    printf("d.a = %d, d.b = %d\n", d.a, d.b);
}
Here is the generated output of the executable:
theon$ gcc struct_retval.c
theon$ ./a.out
d.a = 12, d.b = 34
theon$
Declaration and Definition
You can declare a structure Foo and define variables of type struct Foo in one go:
struct Foo
{
    int a, b;
} foo;
This is equivalent to
struct Foo
{
    int a, b;
};

struct Foo foo;
Anonymous Structs
This defines a variable foo whose type is an anonymous struct:
struct
{
    int a, b;
} foo;
This makes sense, for example, if you need exactly one instance. Internally a compiler may give the struct declaration some unique tag, e.g.
struct .SomeUniqueTag
{
    int a, b;
} foo;
Anonymous Structs in Nested Structs
Anonymous structs can also be useful in nested structs if you just want more structure:
struct Dummy
{
    struct
    {
        int a, b;
    } foo;
    struct
    {
        int a, b;
    } bar;
};

struct Dummy dummy;
Then dummy has members like dummy.foo.a and dummy.bar.a.