OverIQ.com

Union Basics in C

Last updated on July 27, 2020


Let's say you are creating a program to record the name and quantity of different goods, where the quantity might be a count, a weight, or a volume. One way to approach the problem is to create a structure as follows:

struct goods
{
    char name[20];
    int count;
    float weight;
    float volume;
};

struct goods balls = {"balls", 10};

We know that the quantity of balls is measured by count, so in this case there is no need for weight and volume.

Similarly in the following statement:

struct goods flour = {"flour", 0, 20};

The quantity of flour is measured by weight, so in this case there is no need to store count and volume.

From these observations, we can conclude that a particular type of goods is measured using only one kind of quantity at a time: a count, a weight, or a volume.

At this point our program has the following limitations:

  • It takes more space than required, so it is less memory-efficient.
  • Someone might set more than one value.

It would be much more useful if we could record the quantity as exactly one of a count, a weight, or a volume. That way we would also save memory.

In C, a union allows us to do just that.

What is a Union?

Like structures, unions are used to create new data types, and they contain members just like structures do. The syntax for defining a union, creating union variables, and accessing the members of a union is the same as that of structures; the only difference is that the union keyword is used instead of struct.

The important difference between structures and unions is that in a structure each member has its own memory, whereas the members of a union share the same memory. When a variable of a union type is declared, the compiler allocates enough memory to hold the largest member of the union. Since all members share the same memory, only one member can hold a meaningful value at a time, which is why unions are used to save memory. The syntax of declaring a union is as follows:

Syntax:

union tagname
{
    data_type member_1;
    data_type member_2;
    data_type member_3;
    ...
    data_type member_N;
};

Just as with a structure, you can declare a union variable together with the union definition or separately.

union tagname
{
    data_type member_1;
    data_type member_2;
    data_type member_3;
    ... 
    data_type member_N;
} var_union;

union tagname var_union_2;

If we have a union variable, we can access its members using the dot operator (.). Similarly, if we have a pointer to a union, we can access its members using the arrow operator (->).

The following program demonstrates how to use a union.

#include<stdio.h>

/*
union is defined above all functions so it is global.
*/

union data
{
    int var1;
    double var2;
    char var3;
};

int main()
{
    union data t;

    t.var1 = 10;
    printf("t.var1 = %d\n", t.var1);

    t.var2 = 20.34;
    printf("t.var2 = %f\n", t.var2);

    t.var3 = 'a';
    printf("t.var3 = %c\n", t.var3);

    printf("\nSize of union: %zu", sizeof(t));

    return 0;
}

Expected Output:

t.var1 = 10
t.var2 = 20.340000
t.var3 = a

Size of union: 8

How it works:

In lines 7-12, a union data is defined with three members: var1 of type int, var2 of type double, and var3 of type char. When a variable of this union type is declared, the compiler allocates enough memory to hold the largest member of the union. In this case the largest member is the double, which is 8 bytes on most platforms, so 8 bytes are allocated. Had the same members been declared in a structure, the compiler would have allocated at least 13 bytes (8+4+1) of memory (here we are ignoring holes, that is, the padding added for alignment).

In line 16, a union variable t of type union data is declared.

In line 18, the first member of t, i.e. var1, is assigned a value of 10. The important thing to note is that at this point the other two members contain garbage values.

In line 19, the value of t.var1 is printed using the printf() statement.

In line 21, the second member of t, i.e. var2, is assigned a value of 20.34. This overwrites the memory used by var1, so at this point the other two members contain garbage values.

In line 22, the value of t.var2 is printed using the printf() statement.

In line 24, the third member of t, i.e. var3, is assigned a value of 'a'. At this point, the other two members contain garbage values.

In line 25, the value of t.var3 is printed using the printf() statement.

In line 27, the sizeof operator is used to print the size of the union. As we know, in the case of a union the compiler allocates enough memory to hold the largest member. The largest member of union data is var2, so sizeof yields 8, which is then printed using the printf() statement.

Initializing Union Variable

In the above program, we have seen how we can assign values to individual members of a union variable. We can also initialize the union variable at the time of declaration, but there is a limitation. Since the members of a union share the same memory, they can't all hold values simultaneously. So we can only initialize one of the members of the union at the time of declaration, and by default this privilege goes to the first member. For example:

union data
{
    int var1;
    double var2;
    char var3;
};

union data j = {10};

This statement initializes the union variable j or in other words, it initializes only the first member of the union variable j.

Designated initializer

A designated initializer (available since C99) allows us to set the value of a member other than the first member of the union. Let's say we want to initialize the var2 member of union data at the time of declaration. Here is how we can do it.

union data k = { .var2 = 9.14 };

This will set the value of var2 to 9.14. Similarly, we can initialize the third member of another variable at the time of declaration.

union data m = { .var3 = 'a' };

The following program demonstrates the difference between a structure and a union.

#include <stdio.h>
/*
struct and union are defined above all functions so they are global.
*/

struct s
{
    int var1;
    double var2;
    char var3;
};

union u
{
    int var1;
    double var2;
    char var3;
};

int main()
{
    struct s a;
    union u b;

    printf("Information about structure variable \n\n");

    printf("Address of variable a = %p\n", (void *)&a);
    printf("Size of variable a = %zu\n", sizeof(a));

    printf("Address of 1st member i.e var1 = %p\n", (void *)&a.var1);
    printf("Address of 2nd member i.e var2 = %p\n", (void *)&a.var2);
    printf("Address of 3rd member i.e var3 = %p\n", (void *)&a.var3);

    printf("\n");

    printf("Information about union variable \n\n");

    printf("Address of variable b = %p\n", (void *)&b);
    printf("Size of variable b = %zu\n", sizeof(b));

    printf("Address of 1st member i.e var1 = %p\n", (void *)&b.var1);
    printf("Address of 2nd member i.e var2 = %p\n", (void *)&b.var2);
    printf("Address of 3rd member i.e var3 = %p\n", (void *)&b.var3);
    printf("\n\n");

    return 0;
}

Expected Output (the addresses and the way they are formatted vary between systems):

Information about structure variable

Address of variable a = 0x28ff08
Size of variable a = 24
Address of 1st member i.e var1 = 0x28ff08
Address of 2nd member i.e var2 = 0x28ff10
Address of 3rd member i.e var3 = 0x28ff18

Information about union variable

Address of variable b = 0x28ff00
Size of variable b = 8
Address of 1st member i.e var1 = 0x28ff00
Address of 2nd member i.e var2 = 0x28ff00
Address of 3rd member i.e var3 = 0x28ff00

How it works:

In lines 6-11, a structure of type s is declared with three members: var1 of type int, var2 of type double, and var3 of type char.

In lines 13-18, a union of type u is declared with three members: var1 of type int, var2 of type double, and var3 of type char.

Lines 22 and 23 declare a structure variable a of type struct s and a union variable b of type union u, respectively.

In line 27, the address of the structure variable a is printed using the & operator.

In line 28, the size of the structure variable is printed using the sizeof operator.

Similarly, the printf() statements in lines 38 and 39 print the address and size of the union variable b, respectively.

All the members of a union share the same memory, which is why the next three printf() statements print the same address.

Notice that the members of the union share the same address while the members of the structure don't. The difference in size between the structure and the union variable also suggests that in some cases a union may provide a more economical use of memory. Another important point is that the size of a structure may be greater than the sum of the sizes of its members because of the boundary alignment discussed earlier; here the structure occupies 24 bytes although its members need only 13. The same is true for unions: a union may be larger than its largest member.

A structure can be a member of a union. Similarly, a union can be a member of a structure.

Let's now shift our attention back to the problem we discussed while introducing unions.

After learning about unions, we know that only one member of a union variable is usable at a time, which means a union is perfect for defining the quantity. So instead of storing the different quantities as members of the structure, why not create a union for the quantity? That way, for any goods, only one member of the union will be usable.

struct goods
{
    char name[20];

    union quantity
    {
        int count;
        float weight;
        float volume;
    } quant;
} g;

Instead of nesting union quantity we can define it outside the goods structure.

union quantity
{
    int count;
    float weight;
    float volume;
};

struct goods
{
    char name[20];
    union quantity quant;
} g;

If we want to access the value of count we can write:

g.quant.count

Similarly to access the value of weight we can write:

g.quant.weight

The following program demonstrates how we can use a union as a member of the structure.

#include<stdio.h>

/*
union is defined above all functions so it is global.
*/

union quantity
{
    int count;
    float weight;
    float volume;
};

struct goods
{
    char name[20];
    union quantity q;
};

int main()
{
    struct goods g1 = { "apple", {.weight=2.5} };
    struct goods g2 = { "balls", {.count=100} };

    printf("Goods name: %s\n", g1.name);
    printf("Goods quantity: %.2f\n\n", g1.q.weight);

    printf("Goods name: %s\n", g2.name);
    printf("Goods quantity: %d\n\n", g2.q.count);

    return 0;
}

Expected Output:

Goods name: apple
Goods quantity: 2.50

Goods name: balls
Goods quantity: 100

How it works:

In lines 7-12, a union quantity is declared with three members namely count of type int, weight of type float and volume of type float.

In lines 14-18, structure goods is declared with 2 members: name, which is an array of characters, and q of type union quantity.

In line 22, structure variable g1 is declared and initialized. The important thing to note is how a designated initializer is used to initialize the weight member of the union. If we had wanted to initialize the first member, count, we could have done it like this:

struct goods g1 = { "apple", {112} };

or

struct goods g1 = { "apple", 112 };

In line 23, structure variable g2 is declared and initialized.

In lines 25 and 26, the name and weight of the first goods are printed using printf() statements.

Similarly, in lines 28 and 29, the name and count of the second goods are printed using printf() statements.