Why does an array index start at 0?


#1

It can be confusing to someone new to arrays. Why is the first element not 1?


#2

It is my understanding that the index starts at zero due to the nature of implementation.

In many languages an array variable is a reference; that is, the value the variable holds is effectively a pointer to a memory location.

So index 0 points to the first section of memory for the underlying array, and index 1 points to the memory location one element further along, where an element is as wide as the array’s data type.

Example in #go:

    // make allocates room for two elements; indexing a nil slice would panic
    numbers := make([]int64, 2)
    numbers[0] = 8887
    numbers[1] = 20

In this example the array holds 64-bit integers, so index 1 sits at the array’s memory address plus 64 bits (8 bytes).


#3

It’s helpful to think about an array as a block of memory. So when you see something like [1, 2, 3, 4] (or other syntax, depending on your language of choice), in memory it’s something like…

memory address       value stored there
0x88880000              1
0x88880004              2
0x88880008              3
0x8888000c              4

The array is just a reference to that place in memory (i.e., 0x88880000). Great, but how do we get to the stuff in that memory? We look at some offset, or index. The first thing in that array is right where the array begins, which has an offset of 0. The next has an index of 1, so an offset of 1 (or, rather, 1 times the size of the thing being stored in the array).

This, I think, goes back to the days of C, where you were just getting some memory, and throwing things in there. Modern languages have all sorts of other stuff that gets allocated (length, allocation size, etc.), so there’s probably some other memory offset that gets added in, but we still think about it as “how many elements away from the beginning”. The first thing is 0 elements away from the beginning.
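
To make the "offset times element size" idea concrete, here is a minimal Go sketch that recomputes the addresses in the table above. The helper name `elementAddress` is made up for illustration, and the base address and 4-byte element size are just the numbers from the table, not anything a real program would hard-code.

```go
package main

import "fmt"

// elementAddress is a hypothetical helper: it returns the address of
// element i in an array that starts at base and stores elements of
// elemSize bytes each.
func elementAddress(base uint64, i int, elemSize uint64) uint64 {
	return base + uint64(i)*elemSize
}

func main() {
	const base = 0x88880000 // start address taken from the table above
	const elemSize = 4      // 4-byte elements, as in the table

	for i := 0; i < 4; i++ {
		fmt.Printf("index %d -> 0x%08x\n", i, elementAddress(base, i, elemSize))
	}
}
```

Index 0 adds nothing to the base, which is why the first element lives at the address of the array itself.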


#4

Interesting aside: not all languages use 0-based indexes. I once worked on a project that used pre-.NET Visual Basic and Visual C++ at the same time. Our VB projects were set up to be 1-based (which was a language option, `Option Base 1`), and the C++ projects were 0-based. The VB project used loops that looked something like…

    Dim MyArray(9)
    For i = 1 To 9
      MyArray(i) = i
    Next

#5

Cool, that’s interesting. I wasn’t sure if it was all languages or not. I used to write BASIC on a Commodore but don’t remember much of the syntax.


#6

Dijkstra wrote an influential article on this which can be found replicated here: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

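The basic idea behind his reasoning is that half-open intervals (lower bound included, upper bound excluded) are the cleanest notation for a range of consecutive integers, and once you use them, starting at zero is the natural choice: an array of N elements covers exactly 0 ≤ i < N.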
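
A small Go sketch of why half-open ranges are convenient, as I understand the argument. Go's slice expressions happen to use the same convention; the helper `length` is only there for illustration.

```go
package main

import "fmt"

// length returns the number of indices in the half-open range [lo, hi).
// No +1 or -1 correction is needed.
func length(lo, hi int) int {
	return hi - lo
}

func main() {
	letters := []string{"a", "b", "c", "d", "e"}

	// The whole slice is the range [0, len(letters)): the lower bound is 0
	// and the upper bound is exactly the element count.
	fmt.Println(length(0, len(letters))) // 5

	// Splitting at any k gives [0, k) and [k, len) with no overlap and no gap.
	k := 2
	left, right := letters[:k], letters[k:]
	fmt.Println(left, right)            // [a b] [c d e]
	fmt.Println(len(left) + len(right)) // 5

	// An empty range is simply one where both bounds are equal.
	fmt.Println(length(k, k)) // 0
}
```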

As an aside, in C and C++, E1[E2] (when not overloaded) is exactly the same as *((E1) + (E2)), so things like 0["hello"] are valid.


#7

In maths, 0 is considered an integer.


#8

That’s super weird, at least coming from Go.


#9

@Robbie It’s weird for us too; more a curiosity of how the language is defined rather than something you’d actually use :smiley:


#10

@TartanLlama I think it’s half the fun of knowing or learning different languages. I’m learning Ruby now and it’s quite weird.


#11

Would that be dependent on the size of the indexer and array elements?


#12

@object88 the pointer arithmetic thing? Yeah, incrementing the pointer by one will add sizeof(pointee_type) bytes to the address, so it depends on the element type.
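
Not C++, but the same effect can be observed from Go. This is a rough sketch, assuming Go 1.18 or later for generics; `stride` is a made-up helper that measures how many bytes separate two neighbouring elements. The step follows the element type, whatever integer type you happen to index with.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stride is a hypothetical helper: it returns the distance in bytes between
// element 0 and element 1 of s. s must have at least two elements.
func stride[T any](s []T) uintptr {
	return uintptr(unsafe.Pointer(&s[1])) - uintptr(unsafe.Pointer(&s[0]))
}

func main() {
	fmt.Println(stride([]int8{1, 2}))  // 1
	fmt.Println(stride([]int32{1, 2})) // 4
	fmt.Println(stride([]int64{1, 2})) // 8
}
```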


#13

They also start at 1 in Lua and zsh, if anyone’s curious. No idea why. Seems arbitrary.


#14

Language designers can make that decision pretty much arbitrarily, but for Lua I have the feeling that they chose 1-based indexing because it makes it easier to teach people who are completely new to programming how arrays work.
People new to programming are often puzzled by how array[size] is past the end of the array in most languages and how 0 is the 1st element, 1 the 2nd etc.


#15

That makes sense. I learned with Python, so I just suffered through the 0 thing. I think that 1 would have been easier, now that I think about it.


#16

I started with TI-BASIC, which used 1-based indexing, I think? Then went on to program in C++ and quickly learned to see 0 as natural because of the reasons given in the above comments by @object88 and @TartanLlama.


#17

It is the offset from the start.

start + 0 = first record


#18

Think in terms of the “20th Century” being the years in the “1900s”.
Same principle.


#20

With regard to Lua specifically, you can actually start an array at index 0, 1, or any other value (although it is customary to start arrays at index 1):

    -- creates an array with indices from -5 to 5
    a = {}
    for i=-5, 5 do
        a[i] = 0
    end

The 1-based array and string indexing apparently was inherited from Sol (one of the languages that Lua is descended from) which was a language designed for engineers with no formal training in computer programming, where starting at one seems more natural.
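
Going back to the -5 to 5 loop: this works because a Lua table is an associative array, so -5 is just another key rather than an offset from a start address. A rough Go analogue uses a map; this is only an illustration of the idea, not a description of how the Lua interpreter stores tables internally, and `newRange` is a made-up helper.

```go
package main

import "fmt"

// newRange is a hypothetical helper: it builds a map with one zero-valued
// entry for every integer key from lo to hi inclusive, like the Lua loop.
func newRange(lo, hi int) map[int]int {
	a := make(map[int]int)
	for i := lo; i <= hi; i++ {
		a[i] = 0
	}
	return a
}

func main() {
	a := newRange(-5, 5)
	fmt.Println(len(a)) // 11

	// Negative "indices" are ordinary keys; no address arithmetic is involved.
	_, ok := a[-5]
	fmt.Println(ok) // true
}
```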


#21

So strange, I wonder what a good use case would be. I guess the interpreter does the heavy lifting and assigns the lowest number in the range to the first index.