windows registry files

the author of this document will not be responsible for any damage and/or
license violation that may occur. the information within this document is
provided "as is" without warranty of any kind...
this information was "collected" during sleepless nights, and is not
officially released by microsoft! it shall give you a peek at the windows(tm)
internals to give you a chance to recover from corrupted data.
the author has nothing to do with microsoft, except that he uses their
if you don't agree with this, stop reading this document, and delete it at
what is the registry? where did it come from? two questions, which i will try to
answer here. the registry is a database (at least microsoft thinks so:)
which contains configuration information about the system.
it mainly is a memory dump which is saved to one or more files on the windows
host drive. it is loaded every system-boot and remains resident until
shutdown. since parts of it are not used during normal operation it will be 
swapped out very soon. the registry appeared with windows 3.?? (sorry, i can't 
remember any earlier version :-), where it was used for file associations and 
the "ole" functions (the connection between ole-ids and the applications).
this is critical information, and since the registry has (almost) no
checksum information (!), it sometimes gets corrupted. this is the main
reason for this doc.
using windows 3.x, almost every configuration was done using good old ".ini"-
files, which were readable but slow and limited in size (64k). in windows 95
(and nt), the registry was used instead of these files. so, to edit a
particular setting, you would have to run the application which manages these
settings. :( but what if this app won't start? ms included a tool named
regedit in windows 3.?? and 95, and a regedt32 in windows nt. you can use
these apps to edit all contents of the registry (in windows nt the registry
supports security, as well as it provides the security for the whole system!)
an application can open a "key", write values (variables) to it and fill them
with data. each key also carries a value called "default" and can contain
any number of sub-keys. this will form a tree-structure as you can see at
the left half of regedit. (note: regedit from windows 3.?? has to be started
with /v or /y, i can't remember now)
where can i find the registry???
that differs for each windows-version:
version  file(s)                 contents
3.1x     reg.dat                 complete windows 3.?? registry
95       system.dat              system-values (hkey_local_machine)
         user.dat                user-values (hkey_users)
nt       system32\config\sam     sam-part of the registry (=nt security)
         system32\config\software software-specific part
         system32\config\system  system-specific part
         profiles\%username%\ntuser.dat  user-specific part
         profiles\%username%\  like ntuser.dat but a 
if you are using a roaming-profile with windows nt, can be on
a network-share as well...

the registry consists of the following elements:
        hive:   starting point of the structure. the name of a hive starts
                with the "hkey_"-prefix. can be seen as a "drive" in a file
hive name               description                    3.1     95      nt4
hkey_classes_root       points to the "class" key in
                        the "hkey_local_machine" hive,
                        the only hive in windows 3.??   x       x       x
hkey_current_user       information and settings valid
                        for the currently logged in
                        user. (points to the correct            x       x
                        key under "hkey_users")
hkey_current_config     settings for the currently
                        active hardware profile.
                        points to "hkey_local_machine\          x       x
hkey_users              contains all currently active
                        user settings. since nt is a
                        single user system, there
                        will be only one key (the s-id          x       x
                        of the active user), and a
                        ".default" key (the settings
                        for the ctrl-alt-del environment)
hkey_local_machine      all local settings                      x       x
hkey_dyn_data           as the name says, here you'll find      x
                        dynamic data (cpu-usage,...)
        key:    a key to the registry can be seen as a directory in a file
        value:  can be seen as the registry's "file"
        data:   is the actual setting, can be seen as the contents of a
windows 3.x
this registry is the easiest one. it consists of 3 blocks, which are not
"signed" at all:
block                   position        size
header                  0               32 bytes
navigation-info         0x00000020      ???
data-block              ???             ???
the "???" marked values can be read from the header.
offset  size    description
0x0000  8 byte  ascii-text: "shcc3.10"
0x0008  d-word  ?
0x000c  d-word  ? (always equal the d-word at 0x0008)
0x0010  d-word  number of entries in the navigation-block
0x0014  d-word  offset of the data-block
0x0018  d-word  size of the data-block
0x001c  word    ?
0x001e  word    ?
values marked "?" are not important for a read-access, and therefore unknown
to me...
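
a minimal python sketch of reading this header, assuming little-endian fields laid out exactly as in the table above. the function name and the dictionary keys are made up for illustration, and comparing the signature without regard to case is an assumption of this sketch.

```python
import struct

def parse_reg31_header(buf: bytes) -> dict:
    """decode the 32-byte reg.dat header (names are illustrative only)."""
    if len(buf) < 32:
        raise ValueError("need at least 32 bytes")
    # 8-byte text, five d-words, two words = 32 bytes
    (sig, unk1, unk2, nav_entries,
     data_offset, data_size, unk3, unk4) = struct.unpack_from("<8s5I2H", buf, 0)
    if sig.upper() != b"SHCC3.10":  # assumption: case is ignored
        raise ValueError("unexpected signature")
    return {
        "nav_entries": nav_entries,   # d-word at 0x0010
        "data_offset": data_offset,   # d-word at 0x0014
        "data_size": data_size,       # d-word at 0x0018
        "unknown": (unk1, unk2, unk3, unk4),
    }
```

the navigation-block then starts right after these 32 bytes, and the data-block is found through "data_offset" and "data_size".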
this is where chaos rules! it consists of two different, 8 byte long blocks:
        * navigation-info-record,
        * text-info-record
the first record in the navigation block is a navigation info record.
offset  size    contents
0x00    word    next key (same level)
0x02    word    first sub-key (one level deeper)
0x04    word    text-info-record of the key name
0x06    word    text-info-record of the key value (default)
the values are the logical number of the block inside the file:
since 2 of these values are constant:
offset  size    contents
0x00    word    ?
0x02    word    number of references to this text
0x04    word    text-length
0x06    word    offset of the text-string inside the data-block
to get the text-offset inside the file you have to add this offset to the
data-offset inside the header.
the data-block only consists of a collection of text-strings. right in front
of every text is a word which may or may not have a meaning. the offset in
the text-info record points directly to the text, the text-size has to be
defined in the text-info record too.
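
as a rough sketch, a text-info-record can be resolved to its string like this. it assumes little-endian words and a single-byte character set (latin-1 is a guess), and the function and parameter names are hypothetical.

```python
import struct

def read_text(file_bytes: bytes, data_block_offset: int, text_record: bytes) -> str:
    """resolve one 8-byte text-info-record to its string.

    data_block_offset is the d-word at 0x0014 of the file header.
    """
    _unknown, _ref_count, length, rel_offset = struct.unpack("<4H", text_record)
    # the offset in the record is relative to the start of the data-block
    start = data_block_offset + rel_offset
    return file_bytes[start:start + length].decode("latin-1")
```

the word stored in front of each text is simply skipped here, because the record already points directly at the first character.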
windows 95
the windows95-registry files:
inside the windows-directory (default: c:\windows) are 2 files which are
loaded to form the registry:
these files are mapped to the following hives:
	hkey_local_machine in system.dat
	hkey_users in user.dat

the file structure:
both files have the same structure. each of them consists of 3 blocks where
1 of these blocks can be repeated.
every block has a 4 byte long signature to help identify its contents.
id      block-contents          max. size               
creg    header                  32 bytes @ offset 0    
rgkn    directory information
        (tree-structure)        ??? @ offset 32
rgdb    the real data
        (values and data)       max. 65535 bytes at offset ??
these blocks are "stuck together" with no space between them, but always
a multiple of 16 in size.
the creg-block
offset          size            contents
0x00000000      d-word          ascii-"creg" = 0x47455243
0x00000008      d-word          offset of 1st rgdb-block
0x00000010      d-word          # of rgdb-blocks
all other values are not needed to read the registry...
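
a small python sketch of picking these two values out of the creg-block, assuming little-endian d-words at the offsets listed above. the function name is made up for illustration.

```python
import struct

CREG_ID = 0x47455243  # the signature read as a little-endian d-word

def parse_creg(buf: bytes) -> tuple:
    """return (offset of 1st rgdb-block, number of rgdb-blocks)."""
    ident, = struct.unpack_from("<I", buf, 0x00)
    if ident != CREG_ID:
        raise ValueError("creg signature not found")
    first_rgdb, = struct.unpack_from("<I", buf, 0x08)
    rgdb_count, = struct.unpack_from("<I", buf, 0x10)
    return first_rgdb, rgdb_count
```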
the rgkn-block
i assume that rgkn stands for registry-key-navigation. this block contains
the information needed to build the tree-structure of the registry. this
block can be larger than 65535 bytes (0xffff)!
all offset-values are relative to the rgkn-block!
offset          size    contents
0x00000000      d-word  ascii-"rgkn" = 0x4e4b4752
0x00000004      d-word  size of the rgkn-block in bytes
0x00000008      d-word  rel. offset of the root-record
0x00000020      ????    tree-records (often the 1st record)
the tree-record
the tree-record is a "complete" registry-key. it contains the "hash"-info
for the real data stored in this key.
offset  size    contents
0x0000  d-word  always 0
0x0004  d-word  hash of the key-name
0x0008  d-word  always -1 (0xffffffff)
0x000c  d-word  offset of the owner (parent) record
0x0010  d-word  offset of the 1st sub-key record
0x0014  d-word  offset of the next record in this level
0x0018  d-word  id-number of the real key
the 1st entry in a "usual" registry file is a nul-entry with subkeys: the
hive itself. it looks the same as other keys. even the id-number can
be any value.
the "hash"-value is a value representing the key's name. windows will not
search for the name, but for a matching hash-value. if it finds one, it
will compare the actual string info, otherwise continue with the next key.
end of list-pointers are filled with -1 (0xffffffff)
the id-field has the following format:
        bits 31..16:    number of the corresponding rgdb-block
        bits 15..0:     continuous number inside this rgdb-block.

the hash-method:
you are looking for the key:    software\microsoft
first you take the first part of the string and convert it to upper case.
the "\" is used as a separator only and has no meaning here.
next you initialize a d-word with 0 and add all ascii-values of the string
which are smaller than 0x80 (128) to this d-word.
        software = 0x0000026b
now you can start looking for this hash-value in the tree-record.
if you modify a key name, also update its hash-value, otherwise the key
cannot be found again (although it would still be displayed in regedit)
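
the hash-method as described above, written out as a short python sketch. the function names are made up, and dropping every non-ascii character before the conversion to upper case is a simplification of this sketch (such characters would be skipped by the "smaller than 0x80" rule anyway).

```python
def first_component(path: str) -> str:
    """return the part of a key path in front of the first backslash."""
    return path.split("\\", 1)[0]

def key_hash(component: str) -> int:
    """add up the upper-cased ascii values below 0x80 into a d-word."""
    raw = component.encode("ascii", errors="ignore").upper()
    total = 0
    for byte in raw:
        if byte < 0x80:
            total += byte
    return total & 0xFFFFFFFF

# key_hash(first_component("software\\microsoft")) gives 0x26b,
# the same value as in the example above
```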
the rgdb-block
offset  size    contents
0x0000  d-word  ascii-"rgdb" = 0x42444752
0x0004  d-word  size of this rgdb-block
0x0020  ????    rgdb records
rgdb-record (key-information)
offset  size    contents
0x0000  d-word  record length in bytes
0x0004  d-word  id-number
0x0008  d-word  ??? size ???
0x000c  word    text length of key name
0x000e  word    number of values inside this key
0x0010  d-word  always 0
0x0014  ????    key-name
0x????  ????    values
the first size (record length) can be used to find the next record.
the second size value is only correct if the key has at least one value, 
otherwise it is a little lower.
the key-name is not 0-terminated, its length is defined by the key-
text length field. the values are stored as records.
offset	size	contents
0x0000	d-word	type of data
0x0004	d-word	always 0
0x0008	word	length of value-name
0x000a	word	length of value-data
0x000c	????	value-name
0x????	????	data
value		contents
0x00000001	regsz - 0-terminated string (sometimes without the 0!)
0x00000003	regbin - binary value (a simple data-block)
0x00000004	regdword - d-word (always 4 bytes in size)
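
putting the two tables together, one rgdb-record with its values could be read like this. it is a sketch under assumptions: little-endian fields, names decoded as latin-1, and a made-up function name and result layout.

```python
import struct

def parse_rgdb_record(buf: bytes, offset: int = 0) -> dict:
    """decode one rgdb-record (key-information) and the values behind it."""
    rec_len, key_id, _size2, name_len, value_count, _zero = \
        struct.unpack_from("<IIIHHI", buf, offset)
    pos = offset + 0x14
    name = buf[pos:pos + name_len].decode("latin-1")  # not 0-terminated
    pos += name_len
    values = []
    for _ in range(value_count):
        vtype, _zero2, vname_len, data_len = struct.unpack_from("<IIHH", buf, pos)
        pos += 0x0C
        vname = buf[pos:pos + vname_len].decode("latin-1")
        pos += vname_len
        data = buf[pos:pos + data_len]
        pos += data_len
        values.append((vname, vtype, data))
    # offset + rec_len is where the next record starts
    return {"length": rec_len, "id": key_id, "name": name, "values": values}
```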

windows nt (version 4.0)
whoever thought that the registries of windows 95 and windows nt are similar
will be surprised! they only look much the same, but have completely other
since the rgdb-blocks in the windows 95 registry are not larger than
0xffff, we can see that it is optimized for a 16-bit os...
windows nt stores its registry in a page-oriented format with blocks
of 4kb (4096 = 0x1000 bytes)
the windows nt registry has 2 different blocks, where one can occur many
the "regf"-block
"regf" is obviously the abbreviation for "registry file". "regf" is the
signature of the header-block which is always 4kb in size, although only
the first 64 bytes seem to be used and a checksum is calculated over
the first 0x200 bytes only!
offset		size	contents
0x00000000	d-word	id: ascii-"regf" = 0x66676572
0x00000004	d-word	????
0x00000008	d-word	???? always the same value as at 0x00000004
0x0000000c	q-word	last modify date in winnt date-format
0x00000014	d-word	1
0x00000018	d-word	3
0x0000001c	d-word	0
0x00000020	d-word	1
0x00000024	d-word	offset of 1st key record
0x00000028	d-word	size of the data-blocks (filesize-4kb)
0x0000002c	d-word	1
0x000001fc	d-word	checksum: xor of all d-words from 0x00000000 to 0x000001fb
i have analyzed more registry files (from multiple machines running
nt 4.0 german version) and could not find an explanation for the values
marked with ???? the rest of the first 4kb page is not important...
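
a python sketch of reading the known header fields and checking the checksum. the checksum is computed here as a 32-bit xor over the d-words in front of 0x000001fc, which is an assumption taken from later public descriptions of the format. the function names and dictionary keys are invented for this example.

```python
import struct

REGF_ID = 0x66676572  # "regf" read as a little-endian d-word

def xor_checksum(page: bytes) -> int:
    """xor all d-words from 0x000 up to (not including) 0x1fc."""
    result = 0
    for (dword,) in struct.iter_unpack("<I", page[:0x1FC]):
        result ^= dword
    return result

def parse_regf(page: bytes) -> dict:
    """decode the documented fields of the first 4kb page."""
    if len(page) < 0x200:
        raise ValueError("need at least 0x200 bytes")
    ident, = struct.unpack_from("<I", page, 0x00)
    if ident != REGF_ID:
        raise ValueError("not a regf header")
    modified, = struct.unpack_from("<Q", page, 0x0C)
    root_key_offset, data_size = struct.unpack_from("<II", page, 0x24)
    stored, = struct.unpack_from("<I", page, 0x1FC)
    return {
        "modified": modified,                # winnt date-format
        "root_key_offset": root_key_offset,  # relative to the 1st hbin-block
        "data_size": data_size,
        "checksum_ok": stored == xor_checksum(page),
    }
```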
the "hbin"-block
i don't know what "hbin" stands for, but this block is always a multiple 
of 4kb in size.
inside these hbin-blocks the different records are placed. the memory-
management looks like a c-compiler heap management to me...
offset	size	contents
0x0000	d-word	id: ascii-"hbin" = 0x6e696268
0x0004	d-word	offset from the 1st hbin-block
0x0008	d-word	offset to the next hbin-block
0x001c	d-word	block-size
the values in 0x0008 and 0x001c should be the same, so i don't know
if they are correct or swapped...
from offset 0x0020 inside a hbin-block data is stored with the following
offset	size	contents
0x0000	d-word	data-block size
0x0004	????	data
if the size field is negative (bit 31 set), the corresponding block
is in use and has a size of -blocksize! a positive size marks a free block.
the data is stored as one record per block. block size is a multiple
of 4 and the last block reaches the next hbin-block, leaving no room.
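
the data-blocks of one hbin-block can be walked with a loop like the following sketch. it hands back the size exactly as stored (signed), so the caller decides what the sign means, and it uses the absolute value to step to the next entry. the function name is made up, and the input is assumed to be the bytes of exactly one hbin-block.

```python
import struct

HBIN_ID = 0x6E696268  # "hbin" read as a little-endian d-word

def iter_cells(hbin: bytes):
    """yield (offset, signed_size, payload) for each data-block."""
    ident, = struct.unpack_from("<I", hbin, 0)
    if ident != HBIN_ID:
        raise ValueError("not an hbin-block")
    pos = 0x20  # data starts behind the 0x20 byte block header
    while pos + 4 <= len(hbin):
        size, = struct.unpack_from("<i", hbin, pos)
        length = abs(size)
        if length < 4 or pos + length > len(hbin):
            break  # corrupt or truncated entry, stop instead of looping forever
        yield pos, size, hbin[pos + 4:pos + length]
        pos += length
```

the returned offset is relative to the start of this hbin-block, so the block's own offset has to be added before comparing it with offsets stored in records.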
records in the hbin-blocks
	the nk-record can be treated as a combination of tree-record and
	key-record of the win 95 registry.
	the lf-record is the counterpart to the rgkn-record (the hash-function)
	the vk-record contains information about a single value.
	sk (? security key ?) is the acl of the registry.
	the value-lists contain information about which values are inside a
	sub-key and don't have a header.
	the data of the registry is (like the value-list) stored without a 
all offset-values are relative to the first hbin-block and point to the block-
size field of the record-entry. to get the file offset, you have to add
the header size (4kb) and the size field (4 bytes)...
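
the offset arithmetic from the paragraph above, as a tiny python sketch (the names are made up for illustration):

```python
HEADER_SIZE = 0x1000  # the regf-block in front of the first hbin-block
SIZE_FIELD = 4        # the block-size d-word in front of every record

def size_field_file_offset(rel_offset: int) -> int:
    """file position of the block-size field a stored offset points to."""
    return HEADER_SIZE + rel_offset

def record_file_offset(rel_offset: int) -> int:
    """file position of the record itself (its 2-byte id)."""
    return HEADER_SIZE + rel_offset + SIZE_FIELD
```

for example, a stored offset of 0x20 leads to the size field at file offset 0x1020 and to the record id at 0x1024.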
the nk-record
offset	size	contents
0x0000	word	id: ascii-"nk" = 0x6b6e
0x0002	word	for the root-key: 0x2c, otherwise 0x20
0x0004	q-word	write-date/time in windows nt notation
0x0010	d-word	offset of owner/parent key
0x0014	d-word	number of sub-keys
0x001c	d-word	offset of the sub-key lf-records
0x0024	d-word	number of values
0x0028	d-word	offset of the value-list
0x002c	d-word	offset of the sk-record
0x0030	d-word	offset of the class-name
0x0044	d-word	unused (data-trash)
0x0048	word	name-length
0x004a	word	class-name length
0x004c	????	key-name
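
a sketch of decoding the nk-record fields listed above. the input is assumed to start at the "nk" id (that is, behind the 4-byte size field), all fields are taken as little-endian, and the key-name is decoded as latin-1, which is a guess. function name and dictionary keys are invented here.

```python
import struct

NK_ID = 0x6B6E  # "nk" read as a little-endian word

def parse_nk(rec: bytes) -> dict:
    """decode the documented fields of one nk-record."""
    ident, key_type = struct.unpack_from("<HH", rec, 0x00)
    if ident != NK_ID:
        raise ValueError("not an nk-record")
    written, = struct.unpack_from("<Q", rec, 0x04)
    parent, subkey_count = struct.unpack_from("<II", rec, 0x10)
    lf_offset, = struct.unpack_from("<I", rec, 0x1C)
    value_count, value_list, sk_offset, class_offset = \
        struct.unpack_from("<4I", rec, 0x24)
    name_len, class_len = struct.unpack_from("<HH", rec, 0x48)
    return {
        "is_root": key_type == 0x2C,
        "written": written,
        "parent": parent,
        "subkey_count": subkey_count,
        "lf_offset": lf_offset,
        "value_count": value_count,
        "value_list": value_list,
        "sk_offset": sk_offset,
        "class_offset": class_offset,
        "class_len": class_len,
        "name": rec[0x4C:0x4C + name_len].decode("latin-1"),
    }
```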
the value-list
offset	size	contents
0x0000	d-word	offset 1st value
0x0004	d-word	offset 2nd value
0x????	d-word	offset nth value
to determine the number of values, you have to look at the
the vk-record
offset	size	contents
0x0000	word	id: ascii-"vk" = 0x6b76
0x0002	word	name length
0x0004	d-word	length of the data
0x0008	d-word	offset of data
0x000c	d-word	type of value
0x0010	word	flag
0x0012	word	unused (data-trash)
0x0014	????	name
if bit 0 of the flag-word is set, a name is present, otherwise the
value has no name (=default)
if the data-size is lower than 5, the data-offset value is used to store
the data itself!
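
a sketch of decoding a vk-record, including the flag bit and the "small data lives in the offset field" rule from above. the input is assumed to start at the "vk" id. masking off bit 31 of the length before the comparison is an assumption based on later public descriptions of the format, and the names are made up.

```python
import struct

VK_ID = 0x6B76  # "vk" read as a little-endian word

def parse_vk(rec: bytes) -> dict:
    """decode one vk-record; inline_data is None if the data is stored elsewhere."""
    ident, name_len, data_len, data_field, vtype, flag, _unused = \
        struct.unpack_from("<HHIIIHH", rec, 0)
    if ident != VK_ID:
        raise ValueError("not a vk-record")
    # bit 0 of the flag-word clear: the value has no name (= default)
    name = rec[0x14:0x14 + name_len].decode("latin-1") if flag & 1 else ""
    length = data_len & 0x7FFFFFFF  # assumption: bit 31 is only a marker
    inline_data = None
    if length < 5:
        # the offset field holds the data bytes themselves
        inline_data = struct.pack("<I", data_field)[:length]
    return {
        "name": name,
        "type": vtype,
        "length": length,
        "data_offset": None if inline_data is not None else data_field,
        "inline_data": inline_data,
    }
```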
the data-types
value	meaning
0x0001	regsz: 		character string (in unicode!)
0x0002	expandsz: 	string with "%var%" expanding (unicode!)
0x0003	regbin:		raw-binary value
0x0004	regdword:	dword
0x0007	regmultisz:	multiple strings, separated with 0
the "lf"-record
offset	size	contents
0x0000	word	id: ascii-"lf" = 0x666c
0x0002	word	number of keys
0x0004	????	hash-records
offset	size	contents
0x0000	d-word	offset of corresponding "nk"-record
0x0004	d-word	ascii: the first 4 characters of the key-name, 
		padded with 0's. case sensitive!
keep in mind, that the value at 0x0004 is used for checking the
data-consistency! if you change the key-name you have to change the
hash-value too!
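
the lf-record and its 4-character "hash" could be handled like this. it is a sketch only: the record is assumed to start at the "lf" id, the key-name is encoded as latin-1 (a guess), and both function names are invented.

```python
import struct

LF_ID = 0x666C  # "lf" read as a little-endian word

def name_hint(key_name: str) -> bytes:
    """first 4 characters of the key-name, padded with 0's, case kept."""
    return key_name.encode("latin-1")[:4].ljust(4, b"\0")

def parse_lf(rec: bytes) -> list:
    """return a list of (nk-record offset, 4-byte hint) pairs."""
    ident, count = struct.unpack_from("<HH", rec, 0)
    if ident != LF_ID:
        raise ValueError("not an lf-record")
    entries = []
    for i in range(count):
        # every hash-record is 8 bytes: a d-word offset plus 4 name bytes
        nk_offset, hint = struct.unpack_from("<I4s", rec, 4 + 8 * i)
        entries.append((nk_offset, hint))
    return entries
```

when renaming a key, the matching entry would have to be rewritten with the result of name_hint() for the new name.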
the "sk"-block
(due to the complexity of the sam-info, not clear yet)
offset	size	contents
0x0000	word	id: ascii-"sk" = 0x6b73
0x0002	word	unused
0x0004	d-word	offset of previous "sk"-record
0x0008	d-word	offset of next "sk"-record
0x000c	d-word	usage-counter
0x0010	d-word	size of "sk"-record in bytes
????	????	security and auditing settings...
the usage counter counts the number of references to this
"sk"-record. you can use one "sk"-record for the entire registry!
windows nt date/time format
the time-format is a 64-bit integer which is incremented every
0.0000001 seconds by 1 (i don't know how accurate it really is!)
it starts with 0 at the 1st of january 1601 0:00! all values are
stored in gmt time! the time-zone is important to get the real
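
a short python sketch of turning such a 64-bit value into a readable date. the function name is made up, and the result is deliberately left in gmt/utc. python's datetime only resolves microseconds, so the last decimal digit of the counter is dropped.

```python
from datetime import datetime, timedelta, timezone

NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def nt_time_to_datetime(ticks: int) -> datetime:
    """convert a count of 0.0000001 second steps since 1601 to a utc datetime."""
    return NT_EPOCH + timedelta(microseconds=ticks // 10)
```

for a local time, the result can be passed through astimezone() with the wanted time-zone.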

common values for win95 and win-nt
offset values marking an "end of list", are either 0 or -1 (0xffffffff).
if a value has no name (length=0, flag(bit 0)=0), it is treated as the
"default" entry...
if a value has no data (length=0), it is displayed as empty.

simplified win-3.?? registry:

| next rec. |---+			+----->	+------------+
| first sub |   |			|	| usage cnt. |
| name      |	|  +-->	+------------+	|	| length     |
| value     |	|  |	| next rec.  |	|	| text       |------->	+-------+
+-----------+	|  |	| name rec.  |--+	+------------+		| xxxxx |
   +------------+  |	| value rec. |-------->	+------------+		+-------+
   v		   |	+------------+		| usage cnt. |
+-----------+	   |				| length     |
| next rec. |	   |				| text       |------->	+-------+
| first sub |------+				+------------+		| xxxxx |
| name      |								+-------+
| value     |

greatly simplified structure of the nt-registry:
    v                                                                         |
+---------------+	+------------->	+-----------+  +------>	+---------+   |
| "nk"		|	|		| lf-rec.   |  |	| nk-rec. |   |
| id		|	|		| # of keys |  |	| parent  |---+
| date		|	|		| 1st key   |--+	| ....    |
| parent	|	|		+-----------+		+---------+
| sub-keys	|-------+
| values	|--------------------->	+----------+
| sk-rec.	|---------------+	| 1. value |--> +----------+
| class		|--+		|	+----------+	| vk-rec.  |
+---------------+  |		|			| ....     |
		   v		|			| data     |--> +-------+
		+------------+	|			+----------+	| xxxxx |
		| class name |	|					+-------+
		+------------+	|
		+---------+	+---------+
	+----->	| next sk |---> | next sk |--+
	|   +---| prev sk | <---| prev sk |  |
	|   |	| ....    |	| ...     |  |
	|   |	+---------+	+---------+  |
	|   |			 ^	     |
	|   +--------------------+           |
hope this helps....  (although it was "fun" for me to uncover these things,
			it took me several sleepless nights ;)