Git Guts: git logs
Friday 7th January 2022 9:21 PM

We saw in the previous section how git records which commit each ref is pointing to. Each time a ref is changed to point to a new commit, a record of that change is made, so that each ref has a history of which commits it has pointed to. There's more information about this here.

These logs can be useful, e.g. when a commit becomes unreachable and orphaned[1]. While it might look like work has been lost, the objects are still there in the repo[2], and by examining the reflog, you can reset a ref to bring those objects back into play.
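
As a rough sketch of that recovery step, the snippet below lists every commit a ref has pointed to, most recent first, given the text of its log file. It assumes only that each line starts with the before and after object names; the function name commits_seen() and the sample ids are made up for illustration.

```python
def commits_seen(log_text):
    """Return the distinct commit ids a ref has pointed to, most recent first."""
    null_id = "0" * 40
    seen = []
    for line in reversed(log_text.splitlines()):
        fields = line.split(" ", 2)
        if len(fields) < 2:
            continue  # blank or malformed line
        # take the line's "after" value first, then its "before" value
        for commit_id in (fields[1], fields[0]):
            if commit_id != null_id and commit_id not in seen:
                seen.append(commit_id)
    return seen

# made-up log: the ref is created at aaaa..., moves to bbbb..., then is reset back
sample = "\n".join([
    "0"*40 + " " + "a"*40 + " A User <user@example.com> 1641500000 +0000\tcommit (initial): first",
    "a"*40 + " " + "b"*40 + " A User <user@example.com> 1641500100 +0000\tcommit: second",
    "b"*40 + " " + "a"*40 + " A User <user@example.com> 1641500200 +0000\treset: moving to HEAD~1",
])
for commit_id in commits_seen(sample):
    print(commit_id)
```

Any id in that list that no ref points to any more is a candidate for `git reset --hard <id>` or `git branch <name> <id>`.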

Logs are stored in .git/logs/, one file per ref, and are straightforward.

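Each line records one change to the ref. The two object names are the ref's before and after values; a value of all 0's means that there is no value (e.g. the before value when a ref is first created). These are followed by the user's name and email, a timestamp and timezone offset, and then a message, which appears to be optional.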
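
To make the layout concrete, here is a minimal sketch that splits one log line into those fields without a regex. It assumes the message, when present, is separated from the rest of the line by a tab; split_log_line() and the sample line are hypothetical, not part of the scripts for this series.

```python
def split_log_line(line):
    """Split one reflog line into its fields."""
    # the message, if any, follows a tab
    head, _, msg = line.rstrip("\n").partition("\t")
    # peel the fixed fields off both ends; the name in the middle may contain spaces
    old_id, new_id, rest = head.split(" ", 2)
    ident, tstamp, tzone = rest.rsplit(" ", 2)
    name, _, email = ident.rpartition(" <")
    null_id = "0" * 40
    return {
        "old": None if old_id == null_id else old_id,
        "new": None if new_id == null_id else new_id,
        "name": name,
        "email": email.rstrip(">"),
        "timestamp": int(tstamp),
        "tzone": tzone,
        "msg": msg or None,
    }

# made-up example line
line = "0"*40 + " " + "a"*40 + " A User <user@example.com> 1641500000 +0000\tcommit (initial): first"
entry = split_log_line(line)
print(entry["old"], entry["name"], entry["msg"])
```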

This is some code that locates each log file:

    import os

    def read_logs():
        """Find and read logs in a repo."""

        # look for log files
        logs = []
        dname = ".git/logs"
        for root, dirs, files in os.walk( dname ): #pylint: disable=unused-variable
            for fname in sorted( files ):
                # found one - extract the ref name
                fname = os.path.join( root, fname )
                assert fname.startswith( dname )
                ref = fname[ len(dname)+1 : ].replace( os.sep, "/" )
                # read the entries in the log
                entries = _read_log_file( fname )
                logs.append( ( ref, entries ) )

        return logs
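
The same mapping from file path to ref name can also be sketched with pathlib, which takes care of the path separators. This is an alternative for illustration only; find_log_refs() is a hypothetical name, and the directory it is tried on is a throwaway one laid out like .git/logs/.

```python
import tempfile
from pathlib import Path

def find_log_refs(logs_dir):
    """Return the ref name for each log file under logs_dir, sorted."""
    logs_dir = Path(logs_dir)
    return sorted(
        path.relative_to(logs_dir).as_posix()
        for path in logs_dir.rglob("*")
        if path.is_file()
    )

# build a throwaway directory laid out like .git/logs/ to try it on
with tempfile.TemporaryDirectory() as tmp:
    for ref in ("HEAD", "refs/heads/main", "refs/remotes/origin/main"):
        path = Path(tmp, ref)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    refs = find_log_refs(tmp)
    print(refs)
```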

And some code that parses a single log file:

    import re
    import sys

    def _read_log_file( fname ):
        """Read entries from the specified log file."""

        with open( fname, "r", encoding="utf-8" ) as fp:

            entries = []
            for line in fp:

                # skip blank lines
                line = line.rstrip()
                if line == "":
                    continue

                # parse the next entry
                mo = re.search( r"^([0-9a-f]{40}) ([0-9a-f]{40}) (.+?) \<(.+?)\> (\d+) ([+-]\d{4})(\s+[^:]+)?", line )
                if not mo:
                    print( "ERROR: Couldn't parse log line:" )
                    print( "  {}".format( line ) )
                    sys.exit( 1 )
                prev_ref, next_ref, user_name, user_email, tstamp, tzone, entry_type = mo.groups()
                msg = line[mo.end()+2:] if mo.end() < len( line ) else None # nb: this is optional

                # save the entry
                entries.append( {
                    "type": entry_type.strip() if entry_type else entry_type,
                    "refs": (
                        None if prev_ref == "0"*40 else prev_ref,
                        None if next_ref == "0"*40 else next_ref
                    ),
                    "user_name": user_name, "user_email": user_email,
                    "timestamp": parse_timestamp( tstamp, tzone ),
                    "msg": msg
                } )

        return entries
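
The parse_timestamp() helper called above isn't shown in this section. A minimal stand-in, assuming the timestamp is seconds since the Unix epoch and the timezone is a +HHMM or -HHMM offset, might look like this; it is a guess at the helper's behaviour, not the actual implementation.

```python
from datetime import datetime, timedelta, timezone

def parse_timestamp(tstamp, tzone):
    """Convert epoch seconds and a +HHMM/-HHMM offset into an aware datetime.

    NOTE: this is an assumed stand-in for the helper used in the post.
    """
    sign = -1 if tzone[0] == "-" else 1
    offset = sign * timedelta(hours=int(tzone[1:3]), minutes=int(tzone[3:5]))
    return datetime.fromtimestamp(int(tstamp), tz=timezone(offset))

# made-up values, in the form they appear in a log line
print(parse_timestamp("1641590460", "+1100").isoformat())
```

Returning a timezone-aware datetime keeps both pieces of information from the log line together, so entries from users in different timezones can still be compared.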

Source code

A new script logs.py will dump the logs in a repo.



References

[1] Perhaps because you reset HEAD incorrectly.
[2] Orphaned objects will be kept for some time, until they get cleaned up during garbage collection.