Why don't languages like Ruby and Python have direct pointers?

ruby
python

#1

Why don’t newer languages like Ruby and Python have direct pointers, similar to what C++ has?


#2

I went and did some research online and found some information I’d like to share, for anyone who, like me, comes from a beginner C++ programming background and is wondering why newer languages (relative to ones like C++) have done away with direct pointers. This might also help frame, in a more general sense, why newer high-level languages have introduced certain levels of abstraction:

It seems that most modern programming languages have done away with direct pointers because they caused so many problems in earlier languages. Null pointer dereferences, out-of-bounds pointer access, and the like were all big problems with older languages like C++.

Another reason, apparently, is to get rid of direct memory access. Newer programming languages are “managed memory” languages - rather than making you do things like allocate memory for large structures and then release it yourself, they take care of allocating the necessary memory when new objects are created and releasing it once the program is no longer using those objects. The part of this abstraction that reclaims unused memory is called “Garbage Collection.” From this, you can see that managed memory languages are a big improvement for programmer productivity. According to something I read on the internet (don’t quote me on this number), the time spent finding pointer errors in C was as much as 30% of the maintenance time on software projects. When Garbage Collection became a hot thing, that time was almost completely eliminated.

Python and Ruby are both examples of managed memory languages (and apparently Java and Perl are as well - I haven’t messed with those much, so I’m not 100% sure on that).

If I’m wrong on any of this, please do comment on that!


#3

Instead of pointers, think references. With pointers you have access to pointer arithmetic and can cast integer values into pointers. With references, you get most of the same dynamic behavior as pointers, but without the segfaults.
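
A quick Python sketch of the difference (my own example names, nothing official):

```python
# Two names bound to the same list object: classic reference behavior.
a = [1, 2, 3]
b = a              # b is another reference to the same object, not a copy
b.append(4)
print(a)           # [1, 2, 3, 4] - the mutation is visible through both names

# id() reveals object identity, but you can't do arithmetic on it
# or cast an arbitrary integer back into a usable reference.
print(id(a) == id(b))   # True
```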

Binding references to names, along with a few fundamental data structures like array/list and hash/dictionary/object, is enough to build rich data structures and algorithms.
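
For instance, here’s a toy Python sketch of mine: a linked structure that would need explicit pointers in C, built from nothing but dicts and lists holding references.

```python
# Each "node" is just a dict; the links are ordinary object references.
c = {"name": "c", "neighbors": []}
b = {"name": "b", "neighbors": [c]}
a = {"name": "a", "neighbors": [b, c]}

def reachable(start):
    """Depth-first walk following references - no pointers required."""
    seen = {}                      # id -> node (dicts aren't hashable)
    stack = [start]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen[id(node)] = node
            stack.extend(node["neighbors"])
    return sorted(n["name"] for n in seen.values())

print(reachable(a))   # ['a', 'b', 'c']
```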

Of course, the language interpreter/engine underneath has to do all the memory management, and it uses pointers. That code is usually C, and it typically does something like reference counting, garbage collection, or both. Fundamental data structures like array/list are implemented as something like a vector (growable array) of object references. Likewise, dictionary or object types are implemented as hash tables where the values (and maybe the keys) are stored as object references.
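
You can even watch the reference counting happen from inside the interpreter (this is specific to CPython; other implementations like PyPy manage memory differently):

```python
import sys

x = object()
print(sys.getrefcount(x))   # at least 2: the name x plus the temporary
                            # reference created by the function call itself

y = x                       # binding another name bumps the count
print(sys.getrefcount(x))   # one higher than before

del y                       # unbinding a name drops it again
print(sys.getrefcount(x))
```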

Outside of a browser, these languages always have some interface to external code written in C. This is where the hardcore pointer work takes place. An excellent example of this is the numpy library for Python, which provides powerful and memory-efficient matrix operations.
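
A quick taste of what that looks like (assuming numpy is installed; the array contents are just made up for illustration):

```python
import numpy as np

# The data lives in one contiguous buffer managed by numpy's C code,
# not as a million individual Python objects.
a = np.arange(1_000_000, dtype=np.float64)

# Operations are dispatched to compiled C loops over that raw memory.
b = a * 2.0 + 1.0

# Slices are views: new Python objects pointing into the same buffer.
view = b[::2]
print(view.base is b)   # True - no copy was made
```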

(JavaScript is another good example of this kind of thing, but my favorite is Python, since it has a more transparent relationship between the two worlds. Inside Python, almost everything is a dictionary binding names to object references. Inside C, almost everything is a reference-counted pointer to a generic object type.)
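
To make the Python half of that concrete (a rough CPython-flavored sketch; the Point class is just an example of mine):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Instance attributes are literally a dict mapping names to object references...
print(p.__dict__)        # {'x': 1, 'y': 2}

# ...and so are module-level names.
print("p" in globals())  # True

# On the C side, every one of these values is a reference-counted
# pointer to a generic object struct - not visible from here, but
# that's what the interpreter is juggling underneath.
```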


#4

I wouldn’t say “newer” here. Swift, Rust, and Go are much newer than Ruby and Python, but they do have pointers. And C++ includes numerous features to “hide” raw pointers in a lot of common usage.

But you’re definitely onto something that the model of thinking of everything in terms of raw memory has fallen out of favor, at least for day-to-day usage, and for good reason. As you’ve discovered in your research, manual memory management is a prolific source of difficult bugs.

Garbage collection is almost always easier for programmers to work with, but it has some major problems when available memory and computing power are limited, particularly on mobile and embedded platforms. A common Android issue is figuring out how to trick the garbage collector into doing what you want so your program doesn’t stutter. On iOS, Swift doesn’t use garbage collection at all; it uses another type of automatic memory management called ARC (Automatic Reference Counting).

Raw memory can be dramatically faster than managed memory, and many languages (including Python and Ruby) include ways to write in C or C++ when you need that power. But a good garbage collector can sometimes be faster and more memory efficient than what you’d write by hand. (Not better than what could in theory be written by hand, but better than what most programmers would actually write, even experienced ones.)
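
If you’re curious what dropping down to that level looks like from Python, the standard ctypes module lets you hold a genuine C pointer (a tiny sketch of my own):

```python
import ctypes

# A real C int, living outside the world of Python objects.
n = ctypes.c_int(42)

# And a real pointer to it, with C semantics.
p = ctypes.pointer(n)
p.contents.value = 7    # write through the pointer

print(n.value)          # 7
```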

It feels unlikely that a major new language would come along that didn’t have some kind of managed memory as its default. It’s one of those things language designers have found to be very useful. But I think you’ll find that many new languages will also continue to be created that offer pointers when you want them.


#5

Just to add to this: garbage collection and Swift/ObjC-style Automatic Reference Counting (ARC) are quite different. ARC is a compile-time feature, whereas garbage collection is a runtime feature. The result is that in Swift/ObjC, disjoint object graphs cannot be detected (i.e., you can get retain cycles, and they won’t get cleaned up). To avoid this, the user needs to declare some references as weak, which tells the compiler not to increment the retain count for that reference. Swift/ObjC developers haven’t gotten away from memory management; in many ways, it’s just the difficult problems that are left.
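
Since ARC is essentially reference counting without a cycle-detecting collector, the retain-cycle problem is easy to demonstrate even from Python, whose CPython implementation combines reference counting with a separate cycle collector (a rough sketch; the Node class is just for illustration):

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

a, b = Node(), Node()
a.other = b
b.other = a                 # a reference cycle: a -> b -> a

probe = weakref.ref(a)      # watch 'a' without keeping it alive
del a, b

# Reference counting alone cannot free the cycle - each node still
# holds a strong reference to the other. Under ARC this would be a leak.
print(probe() is not None)  # True: still alive

# CPython's separate cycle collector can break it; ARC has no such pass,
# which is why Swift/ObjC code needs 'weak' references instead.
gc.collect()
print(probe())              # None: collected
```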