Understanding how Python counts

2 Comments

I started to learn Python again and this time I hope to have a good feeling about finishing it. This time instead of doing it alone, I joined “A Gentle Introduction to Python” MOOC-E (or Mookie, as I lovingly call it)1. After the first week it seems that it offers just the right flexibility and support to keep me going.

But that’s not the reason why I’m writing this post. I never understood how Python counts2 and I think I’ve cracked it now.

My brother (the physicist) explained to me that counting from 0 onwards is normal wherever serious mathematics is involved. OK, I understand that, it makes sense.

Let’s store a string with 5 characters and call its first and last character.

>>> a = "MONTY"
>>> print(a[0])
M
>>> print(a[4])
Y

The above explains just fine that print(a[0]) produces M and that if I want to get the last character (Y) I have to call print(a[4]).

But this logic falls short if you try to understand why a 5-character string has the range print(a[0:5]) although, as we saw, the last character is № 4. And even more so if you use negative numbers like print(a[-3]). While in the same 5-character string the first character is № 0 and the last character is № 4, in the negative direction the last character is № -1 and the first character is № -5.

>>> print(a[0:5])
MONTY
>>> print(a[0:4])
MONT
>>> print(a[-5])
M
>>> print(a[-1])
Y

So, my theory is3 that the numbers don’t represent the drawers or slots in which the characters (or other items, objects) are placed, but the addresses from which the data is read: the first border of each drawer, if you will. Also I think the data is always read in the positive direction. So even when using negative numbers to call objects, it will take the first one to the right of the address.

This is best represented in a table. Please note that the pipe characters (|) are there only as a visual aid to represent borders.

|  M  |  O  |  N  |  T  |  Y  |
0     1     2     3     4     5
-5    -4    -3    -2    -1    None
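The border idea can be checked with a small sketch. The helper name `right_of_border` is made up for this illustration and is not part of Python; it reads exactly one character starting at a given border, always in the positive direction.

```python
a = "MONTY"

def right_of_border(text, border):
    """Return the single character to the right of the given border."""
    # Hypothetical helper: a one-character slice, read left to right.
    # Border -1 needs special care, because -1 + 1 would be 0.
    end = border + 1 if border != -1 else None
    return text[border:end]

for border in range(len(a)):
    negative = border - len(a)  # the same border, counted from the right
    assert right_of_border(a, border) == a[border]
    assert right_of_border(a, negative) == a[negative]
    print(border, negative, a[border])
```

Each printed row pairs a positive border number with its negative twin and the character that sits to the right of it.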

Besides explaining why negative numbers behave differently, this also explains how calling sequences works, again using positive numbers, negative numbers, or both.

Using the above analogy, for the following code…

>>> print(a[1:3])
ON
>>> print(a[-4:-2])
ON
>>> print(a[1:-2])
ON

…the visual representation would be:

|  M  |  O  |  N  |  T  |  Y  |
0     1     2     3     4     5
-5    -4    -3    -2    -1    None
      ^-----------^   a[1:3], a[-4:-2], a[1:-2]

As you can see, this logic works for both positive and negative numbers as well as mixing the two in sequences.
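One way to see why the three calls agree is to translate every negative border into its positive twin by adding the length of the string. The helper `to_positive` below is a hypothetical name used only for this sketch, and it assumes the border lies within the string.

```python
a = "MONTY"

def to_positive(border, length):
    """Translate a negative border number into its positive twin."""
    # Hypothetical helper; assumes -length <= border <= length.
    return border + length if border < 0 else border

# All three slices from the example run between the same two borders.
for start, stop in [(1, 3), (-4, -2), (1, -2)]:
    pos = (to_positive(start, len(a)), to_positive(stop, len(a)))
    print((start, stop), "->", pos, a[start:stop])
```

Every pair ends up as borders 1 and 3, which is why each slice yields ON.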

The only issue that I still haven’t figured out is how to call a sequence using negative numbers that includes the last character. By analogy the last limit should be -0, but it’s not (probably because -0 = 0). Just omitting the second argument in the range works, but it still baffles me, as this is essentially the same as calling None and it then reaches to the very end.

>>> print(a[-5:-0])

>>> a[-5:-1]
'MONT'
>>> a[-5:]
'MONTY'
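A short sketch of the options for naming the final border when the start is negative. It only uses plain slicing; the observation that an omitted end behaves like None matches the table above, where the last border has no negative number.

```python
a = "MONTY"

# -0 is just the integer 0, so this slice stops at the very first border.
assert -0 == 0
assert a[-5:-0] == a[-5:0] == ""

# Three ways to name the final border when the start is negative.
assert a[-5:] == "MONTY"        # end omitted
assert a[-5:None] == "MONTY"    # None spelled out, same as omitting it
assert a[-5:len(a)] == "MONTY"  # positive number of the last border
print(a[-3:], a[-3:None], a[-3:len(a)])
```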

Update: added new finding that you cannot reach the last item of a sequence when the end of the range is a negative number.

hook out → looking forward to week 2 of Mookie’s Python tutorial

  1. MOOC-E or Mechanical MOOC is a new concept from P2PU for running a fully automated online course. “A Gentle Introduction to Python” is a course that uses videos and materials from MIT OpenCourseWare, the exercises on Codecademy and the OpenStudy platform as well as mailing lists for discussions.
  2. I assume it’s the same in all computer languages, but I never got as far with others.
  3. And if I am terribly wrong here — which is a possibility — please let me know.

Paul Boddie

With “a[0:5]”, all you really need to know is that the second number is one position beyond the last position you want to visit. In other words, you count from 0 up to but not including 5. You can use Python’s range function to show what “0:5” means by asking for “range(0, 5)”:

>>> list(range(0, 5))
[0, 1, 2, 3, 4]

So, that just tells you which positions will be chosen when asking for the slice that is “a[0:5]”. The reason why things are done this way is largely due to the way that you can take the difference of the numbers and know how many positions are involved.
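The point about taking the difference can be illustrated with a small sketch, assuming both numbers lie within the string:

```python
a = "MONTY"

# For borders inside the string, the slice length is simply stop - start.
for start, stop in [(0, 5), (0, 4), (1, 3)]:
    piece = a[start:stop]
    assert len(piece) == stop - start
    print(start, stop, repr(piece), len(piece))

# Neighbouring slices share a border and join back up without overlap.
assert a[0:2] + a[2:5] == a
```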

However, the “negative slicing” case is quite interesting and not something I’d really thought too much about before. Clearly, “a[-5:0]” or “a[-5:-0]” makes sense, but this won’t work because -5 and 0 effectively refer to the same position. And if the start and end refer to the same position, then you get an empty string.

That’s the extra step I think you’re missing: the start and end index numbers you choose are effectively converted to the positive range of numbers allowed by the string. So, -5 is converted to 0, meaning that “a[-5:0]” is effectively converted to “a[0:0]”, which is an empty string. In other words, the negative numbers are just for convenience and have to be converted into “proper” positions.
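This conversion can be observed directly. The sketch below uses the built-in `slice` object and its `indices()` method, which reports the start, stop and step that would be used for a sequence of a given length.

```python
a = "MONTY"

# slice.indices(length) returns the (start, stop, step) actually used
# for a sequence of that length, with negative numbers already converted.
print(slice(-5, 0).indices(len(a)))     # the a[-5:0] case
print(slice(-4, -2).indices(len(a)))
print(slice(-5, None).indices(len(a)))  # end omitted

start, stop, step = slice(-5, 0).indices(len(a))
assert a[start:stop] == a[-5:0] == ""
```

The first line shows start and stop landing on the same position, which is why the result is an empty string.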

It’s fascinating to see how you have constructed a mental model and how it evolves, and it should make everyone question the design decisions involved in forming the language and their own understanding of them.

2012-10-22 9:51 pm

hook

Thank you for the explanations and the kind words.

I actually saw (positive) slicing in a few Python learning tutorials, so I only extended the same logic to the negative numbers.

The spark for this line of thought was that while flashing my DreamPlug I noticed that you call the first address of the first bit of memory you want to operate on.

2012-10-23 6:51 pm