A Fresh Look at Arrays

If you've been programming for a little while you will have come across arrays. An array is a type of linear data structure. By linear we mean that it has one dimension (i.e. it can be visualised as a line).

Arrays are also ordered, by which we mean that they have a fixed order. In low-level languages such as C or Rust, an array is a contiguous block of memory of one type. They have a fixed size.

In C we explicitly write new arrays this way:

// declare an array (block of memory) of length 4
// where int is 32 bytes, this is 128 bytes 
int myArray[4] = {2,4,6,8};

To access an element of an array we use an index, which is a number indicating the position we're interested in.

int myArray[5] = {1,2,3,4,5};
printf("Value is %d", myArray[2]);
// output "Value is 3"

Let's consider some different operations on this type of array. Note that the arrays we're discussing are a little different to the data structures called "arrays" in languages like Javascript, which are far more complex.

There are 4 main operations on data structures: insert, delete, find, and update.

Insert is adding a new element to our data structure.

Delete is removing an element from our data structure by value or position.

Find is finding an element by value or position within the data structure.

Update is finding an element in a data structure and then updating its value.

Questions

  1. Given the definitions above, what do you think the best and worst-case time complexity of finding an element in an array is? Why?
  2. What's the best and worst case for updating an array element? Why?
  3. What's the best and worst case for inserting an element into an array, given low-level arrays have a fixed size?

Insert & Delete: Theory and Practice

Let's look at the operations we need to do to delete an element from an array.

Here is an array:

We want to remove the element at index 2 (i.e. with value 3).

We have an empty space in the middle of our array! Because the array is of fixed size we have no easy method to remedy this.

What we need to do is declare a smaller array.

We then go through each value in the array in turn and copy it across to the smaller array, skipping over the deleted element.

Finally, we delete the previous array.

What's the time complexity of this?

In theory, since we need to copy every element from the old array to the new array, it has time complexity O(n). However, in practice most modern hardware has optimisations for memory copies. This means real-world performance is often much better than O(n) complexity would suggest, and closer to O(1).

A real C compiler will optimise away the difference between these two methods but we can see some of the reasoning behind it in Python libraries such as numpy. numpy gives Python developers access to low-level arrays rather than the more complex list type offered by Python, and therefore gives a big speed boost to complex math operations across lists of numbers.

This is because numpy arrays are contiguous blocks of memory, which computer hardware is very well optimised to deal with, while Python lists are not.

Problems

  1. Inserting into an array follows the same algorithm as deleting from one, but reversed. Draw a series of diagrams to demonstrate the steps required to insert into an array.