How special is the global variable _G?

legends2k · Mar 10, 2016 · Viewed 11.2k times

Excerpt from Lua 5.3 manual:

_G

A global variable (not a function) that holds the global environment (see §2.2). Lua itself does not use this variable; changing its value does not affect any environment, nor vice versa.

Relevant part from §2.2

[…] every chunk is compiled in the scope of an external local variable named _ENV, so _ENV itself is never a free name in a chunk.

[…]

Any table used as the value of _ENV is called an environment.

Lua keeps a distinguished environment called the global environment. This value is kept at a special index in the C registry. In Lua, the global variable _G is initialized with this same value. (_G is never used internally.)

When Lua loads a chunk, the default value for its _ENV upvalue is the global environment. Therefore, by default, free names in Lua code refer to entries in the global environment.

I understand that for every chunk loaded, load sets _ENV, the chunk's first upvalue, to the global environment table, which is the same table that _G points to.

> =_G, _ENV
table: 006d1bd8 table: 006d1bd8

confirms that both point to the same table. The manual states, or rather reassures multiple times, that _ENV and _G are just regular names with no hidden meaning and that Lua itself doesn't use _G internally. I tried the chunk below:

local a = { }
local b = a      -- since tables are objects, both refer to the same table object
print(a, b)      -- same address printed twice
a = { }          -- point one of them to a newly constructed table
print(a, b)      -- new, old table addresses printed

Now doing the same with _G and _ENV:

local g = _G          -- make an additional reference
print(g, _G, _ENV)    -- prints same address thrice
local p = print       -- backup print for later use
_ENV = { }            -- point _ENV to a new table/environment
p(g, _G, _ENV)        -- old, nil, new

table: 00ce1be0    table: 00ce1be0    table: 00ce1be0
table: 00ce1be0    nil                table: 00ce96e0

If _G is an ordinary global, why does it become nil here? If reference counting is done, _G was still holding a reference at the time _ENV released it. Like b above, it too should be holding on to the old table, no?

However, for the below chunk, _G is unchanged / preserved!

_ENV = { _G = _G }
_G.print(_G, _ENV, _ENV._G)   -- old, new, old

But here it is killed:

_ENV = { g = _G }
_ENV.g.print(_ENV, _ENV.g, _G)    -- new, old, nil

Another case where it is preserved:

print(_G, _ENV)                       -- print same address twice
local newgt = {}                      -- create new environment
setmetatable(newgt, {__index = _G})   -- set metatable with _G as __index metamethod
_ENV = newgt                          -- point _ENV to newgt
print(_G, newgt, _ENV)                -- old, new, new

With so many variations in the behaviour of _G, the original reassurance given by the manual seems shaky. What am I missing here?

Answer

siffiejoe · Mar 14, 2016

How special is the global variable _G?

It is special in three ways:

  1. It uses a name reserved for internal use by Lua.
  2. It is created by one of Lua's standard modules (in particular the "base" module). If you create a fresh lua_State without opening the "base" module, you won't have the _G variable. The standalone interpreter has all standard libraries already loaded, though.
  3. Some third-party Lua modules use the global variable _G, and changing/removing it can break those modules.

What's the point of _G?

Global variables in Lua are implemented using a normal table. Any access to a variable that is not a local variable or an upvalue will be redirected to this table. Local variables always take priority, so if you have a global variable and a local variable with the same name, you will always get the local one. And here _G comes into play: If you want the global variable, you can say _G.name instead of name. Assuming the name _G is not a local variable (it's reserved for Lua, remember?!), this will always get you the value of the global variable by using table indexing syntax and thus removing the ambiguity with local variable names. In newer Lua versions (5.2+) you could also use _ENV.name as an alternative, but _G predates those versions and is kept for compatibility.
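The shadowing case described above can be sketched as follows. This is a minimal illustration, assuming Lua 5.2 or later with the standard libraries loaded; the variable name `name` and the function `show` are made up for the example.

```lua
name = "global value"          -- free name: stored in the global environment

local function show()
  local name = "local value"   -- shadows the global of the same name
  -- The bare name resolves to the local; _G.name indexes the globals table.
  return name, _G.name
end

local l, g = show()
print(l, g)
```

Inside `show`, there is no other way to reach the global `name` by its bare identifier, which is why the explicit table access is useful.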

There are other cases where you want to get a hold of the globals table, e.g. for setting a metatable. Lua allows you to customize the behavior of tables (and other values) by setting a metatable using the setmetatable function, but you have to pass the table as a parameter somehow. _G helps you do that.
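As a rough sketch of that use, the snippet below passes _G to setmetatable so that reading an undefined global raises an error instead of silently yielding nil. The error message and the name `undefined_name` are assumptions chosen for illustration, not part of any standard module.

```lua
-- Make reads of undefined globals raise an error instead of returning nil.
setmetatable(_G, {
  __index = function(_, key)
    error("undefined global variable '" .. tostring(key) .. "'", 2)
  end,
})

-- pcall catches the error so the example can inspect it.
local ok, err = pcall(function() return undefined_name end)
print(ok, err)
```

Existing globals such as print are unaffected, because __index is only consulted for keys that are absent from the table.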

If you have added a metatable to the globals table, in certain cases you might want to circumvent the metamethods (__index and/or __newindex) you've just installed. You can use rawget and rawset, but you need to pass the globals table as a parameter as well.
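A small sketch of circumventing the metamethods, assuming a hypothetical counting metatable (the counter `hits` and the names `some_missing_name` and `answer` are invented for the example):

```lua
-- Hypothetical metatable that counts every missed read and every new write.
local hits = 0
setmetatable(_G, {
  __index = function(_, key) hits = hits + 1; return nil end,
  __newindex = function(t, key, value) hits = hits + 1; rawset(t, key, value) end,
})

local a = some_missing_name                -- goes through __index, hits becomes 1
local b = rawget(_G, "some_missing_name")  -- bypasses __index, hits stays 1
rawset(_G, "answer", 42)                   -- bypasses __newindex, hits stays 1
print(hits, a, b, answer)
```

Both rawget and rawset take the table as their first argument, so without _G (or _ENV) there would be nothing to pass.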

Note that all the use cases listed above only apply to Lua code, not C code. In C code you don't have named local variables, only stack indices, so there is no ambiguity. And if you want a reference to the globals table to pass to some function, you can use lua_pushglobaltable (which uses the registry instead of _G). As a consequence, modules implemented in C don't use or need the _G global variable. This applies to Lua's standard library (which is implemented in C) as well. In fact, the reference manual guarantees that _G (the variable, not the table) is not used by Lua or its standard library.

How does _G relate to _ENV?

Since version 5.0, Lua allows you to change the table used to look up global variables on a per-(Lua-)function basis. In Lua 5.0 and 5.1 you used the setfenv function for that (the globals table is also called "function environment", hence the name setfenv). Lua 5.2 introduced a new approach using another special variable name, _ENV. _ENV is not a global variable, though: Lua makes sure that every chunk starts with an _ENV upvalue. The new approach works by letting Lua translate any access to a non-local (and non-upvalue) variable name a into _ENV.a. Whatever _ENV is at that point in the code gets used to resolve global variables. This approach is a lot safer because you can't change the environment of code you didn't write yourself (without using the debug library), and also more flexible because you can change the environment for individual blocks of code by creating local _ENV variables with limited scopes.
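The translation of free names to _ENV fields, and the effect of a local _ENV with limited scope, can be sketched like this (Lua 5.2 or later; the variable names are illustrative only):

```lua
x = "outer"                    -- compiled as _ENV.x = "outer"

local seen_inside
do
  -- A local _ENV with limited scope: free names in this block resolve here.
  local _ENV = { x = "inner" }
  seen_inside = x              -- compiled as _ENV.x, using the local table
end

local seen_outside = x         -- back to the chunk's original _ENV upvalue
print(seen_inside, seen_outside)
```

Note that `seen_inside` is a local declared outside the block, so assigning to it does not go through _ENV at all.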

However, in any case you need a default environment that is used before the programmer has a chance to set a custom one (or if the programmer does not want to change it). On startup, Lua creates this default environment (also called "global environment") for you and stores it in the registry. This default environment is used as the _ENV upvalue for all chunks unless you pass custom environments to load or loadfile. lua_pushglobaltable also retrieves this global environment directly from the registry, so all C modules automatically use it for accessing global variables.
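To illustrate the difference between the default environment and a custom one passed to load, here is a minimal sketch for Lua 5.2 or later. The table `env`, its field `greeting`, and the chunk names are assumptions made up for the example, and it assumes no global named `greeting` has been defined.

```lua
local env = { greeting = "hello from a custom environment" }

-- load(chunk, chunkname, mode, env): the fourth argument becomes the
-- chunk's _ENV upvalue; when omitted, the global environment is used.
local in_custom  = load("return greeting", "custom", "t", env)
local in_default = load("return greeting", "default", "t")

print(in_custom())   -- reads env.greeting
print(in_default())  -- reads the global environment, where greeting is unset
```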

And if the standard "base" C module has been loaded, this default "global environment" has a table field called _G that refers back to the global environment.
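This self-reference is easy to check in the standalone interpreter (a quick sketch, assuming the "base" library is loaded and _ENV has not been replaced):

```lua
local self_ref = (_G._G == _G)            -- the field points back to its own table
local chained  = (_G._G._G == _G)         -- so the chain can be followed indefinitely
local same_env = (rawget(_G, "_G") == _ENV)  -- by default _ENV is that same table
print(self_ref, chained, same_env)
```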

To sum it up:

  • The global variable _G is actually _ENV._G.
  • _ENV is not a global, but an upvalue or a local variable.
  • The _G field of the default "global environment" points back to the global environment.
  • _G and _ENV by default refer to the same table (said global environment).
  • C code uses neither of them, but rather a field in the registry (which points to the global environment by definition).
  • You can replace _G (in the global environment) without breaking C modules or Lua itself (but you might break third-party Lua modules if not careful).
  • You can replace _ENV whenever you want, because it only affects your own code (the current chunk/file at most).
  • If you replace _ENV, you can decide for yourself whether _G (_ENV._G) will be available in the affected code, and what it points to.
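Applied to the four cases from the question, the points above can be sketched as follows. This is an illustrative reconstruction rather than the asker's exact code: it uses scoped local _ENV variables instead of assigning to the chunk's _ENV, and the result table `r` is a helper introduced here. In every block, the bare name _G is simply the lookup _ENV._G.

```lua
local setmetatable = setmetatable   -- captured as a local so it stays reachable
local old = _G                      -- extra reference to the global environment
local r = {}                        -- hypothetical table collecting the results

do
  local _ENV = {}                   -- new environment without a _G field
  r.empty = _G                      -- _ENV._G is nil
end
do
  local _ENV = { _G = old }         -- _G field copied explicitly
  r.copied = _G                     -- _ENV._G is the old table
end
do
  local _ENV = { g = old }          -- old table stored under another key
  r.renamed = _G                    -- _ENV._G is nil; only _ENV.g exists
end
do
  local _ENV = setmetatable({}, { __index = old })
  r.inherited = _G                  -- not found in _ENV, falls through to old._G
end

print(r.empty, r.copied == old, r.renamed, r.inherited == old)
```

So _G never "becomes" nil: the old table and its _G field are untouched, and only the table in which the name _G is looked up has changed.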