Please keep this in mind when you are working on the lab.?For grading, you need to compile?random_tester_1.cpp?and?random_tester_2.cpp?with your implementation of B-trees.Double-check youself and make sure that you are doing it.?In multiple semesters, students have copied my?random_tester_1?and?random_tester_2?rather than compile their own. Of course, mine pass the gradescripts. You want to make sure that?yours?are passing the gradescripts.
I don't have B-Tree lecture notes that I have written. Instead, I rely on?this web page, by David Carlson and Isidore Minerd, which I think is very well done.
You are going to implement the "B+ Tree" variant, where internal nodes only hold keys and pointers to other nodes, and external nodes hold keys and pointers to values. One change from their description is that the value of a key in an internal node is going to be held in the largest val pointer in the key's predecessor subtree. Let me show an example, lifted from their notes:
In their example, the val pointer for J is the pointer to the left of K. In our trees, the val pointer for J will be the pointer to the right of I. Similarly:
The val pointer for F is the pointer to the right of E.
The val pointer for C is the pointer to the right of B.
The val pointer for M is the pointer to the right of L.
The val pointer for R is the pointer to the right of P.
The val pointer for U is the pointer to the right of T.
The val pointer to the right of Z is unused.
(As a corollary to this, when you delete an internal node, you swap it with the previous node in the tree, not the subsequent one. Since we are not doing deletion, that doesn't matter.....).
Jdisk: A disk emulator
We are using the same disk emulator,?jdisk?as in?The FAT lab. Please see that lab writeup for information on?jdisk.c/jdisk.h/jdisk_tester. Please also see that lab for information on how to view and modify binary files in VI.
What you have to write: b_tree.c
Your job is to write?b_tree.c. Its job is to implement the interface in?b_tree.h:
#ifndef _BP_TREE_
#define _BP_TREE_
#include "jdisk.h"
void *b_tree_create(char *filename, long size, int key_size);
void *b_tree_attach(char *filename);
unsigned int b_tree_insert(void *b_tree, void *key, void *record);
unsigned int b_tree_find(void *b_tree, void *key);
void *b_tree_disk(void *b_tree);
int b_tree_key_size(void *b_tree);
void b_tree_print_tree(void *b_tree);
#endif
What you are going to do is implement B-trees on top of jdisks. The keys will be buffers of exactly?key_size?bytes, which is defined when you create a btree. The vals will be buffers of exactly JDISK_SECTOR_SIZE bytes. Each node of the tree will fit into a sector of the disk.
The jdisks?must?have a specific format. That means that the jdisks that your programs create must be readable by my btree programs and vice versa (so long as they have the same sizes for longs and the same endian-ness). They don't have to be identical to mine. They just have to work. Here is the format:
Sector 0
Sector zero can have anything in it, so long as the first 16 (or 12) bytes contain the following:
?Bytes 0-3: The key size as an integer.
?Bytes 4-7: The LBA of the sector that holds the root node of the B-tree.
?Bytes 8-15 (or 8-11 if longs are 4 bytes): The first free LBA on the jdisk. You are going to allocate sectors consecutively from sector 0, and since you never deallocate a sector (our B-Trees don't allow deletion), you can keep track of the free sectors with a single number.
?Yes, the remaining 1008 bytes are wasted. You can use them, but since your program has to be interoperable with mine and others, their values will be ignored.
Let's stop there. Take a look at a file that is a jdisk for a B-Tree:
UNIX> ls -l tree-1.jdisk
-rw-r--r-- 1 plank loci 2048000 Feb 13 15:57 tree-1.jdisk
UNIX> jdisk_test R tree-1.jdisk hex 0 16
17 00 00 00 29 00 00 00 f1 01 00 00 00 00 00 00
UNIX>
The file is roughly 2MB, composed of 2,000 sectors. When we look at the first 16 bytes, we see the numbers in little endian format (which is the format of our current Intel processors). They key size is 0x17, or 23 bytes. The LBA of the root node is 0x29, and the first free sector is LBA 0x1f1 = 497.
Nodes
Now, the format of a sector holding a node of the tree is as follows. The first byte is a zero or one, specifying whether the node is external (0) or internal (1). The next byte says how many keys are in the node. How many keys can you fit into a node? The answer is (JDISK_SECTOR_SIZE - 6) / (key_size + 4). You'll see why in a minute. Let's call that value MAXKEY. Keys must be between 4 and 254 bytes (inclusive). Therefore, even if keys are four bytes, you can fit the number of keys into an unsigned char.
The next MAXKEY * key_size bytes are the keys. Then there can be some wasted bytes. The last (MAXKEY+1)*4 bytes are the LBA's, which are the pointers of the B-Trees. If the node is internal, then they are the LBA's of nodes that are pointed to by the node. If the node is external, then they are the LBA's of vals. If there are?nkeys?keys in the node, then there are?nkeys+1?LBA's.
Data (the vals)
Data is stored in a sector. The data *must* be JDISK_SECTOR_SIZE bytes.
A detailed example
Let's explore this a little. Let's try?tree-2.jdisk:
UNIX> ls -l tree-2.jdisk
-rw-r--r-- 1 plank staff 20480 Feb 11 22:58 tree-2.jdisk
UNIX> jdisk_test R tree-2.jdisk hex 0 16
c8 00 00 00 08 00 00 00 0c 00 00 00 00 00 00 00
UNIX>
This is a file with 20 sectors. The key size is 200 (0xc8), of which 12 (0x0c) are currently in use. The LBA of the root node is 8. Let's take a look at the first few bytes of that node:
UNIX> echo '8 1024 * p' | dc
8192
UNIX> jdisk_test R tree-2.jdisk hex 8192 32
01 01 45 6c 69 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX>
So, this is an internal node, because the first byte is one. It holds one key, because the second byte is also one. The first key is in the next 200 bytes -- if you examine them, they are all zeros except for the first three. I know they hold a string (because I created this file), so let's look at them:
UNIX> jdisk_test R tree-2.jdisk string 8194 3
Eli
UNIX> jdisk_test R tree-2.jdisk string 8194 200
Eli
UNIX>
Just a note here about keys that are strings -- since they are null terminated,?jdisk_test?will print the string, regardless of whether you specify 3 characters or 200. In the example above,?jdisk_test?is reading 200 characters, but since the fourth character is '\0', it only prints out "Eli".
So, there is one key, which means that there are two pointers out of the node. Each of these pointers is an LBA of the sector holding the node to which the pointer points. How do we find these LBA's? Well, first, let's figure out what MAXKEY is: (1024 - 6) / (200+4) = 4.99. That means MAXKEY is four (and there is quite a bit of wasted space in our nodes. So it goes). Since a node can hold four keys, it can hold 5 LBA pointers. Those are in the last 5*4 bytes of the sector. Let's look at them:
UNIX> jdisk_test R tree-2.jdisk hex 9196 20
01 00 00 00 07 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
UNIX>
So, the first pointer is to block 1, and then second is to block 7. Let's look at block 1:
UNIX> jdisk_test R tree-2.jdisk hex 1024 32
00 04 41 6c 65 78 69 73 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX>
It is an external node that holds four keys. The keys will start at bytes 1026, 1226, 1426 and 1626:
UNIX> jdisk_test R tree-2.jdisk hex 1024 32
00 04 41 6c 65 78 69 73 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1026 16
41 6c 65 78 69 73 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1226 16
41 6c 6c 69 73 6f 6e 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1426 16
43 61 6c 65 62 00 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1626 16
44 61 6e 69 65 6c 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 1026 16
Alexis
UNIX> jdisk_test R tree-2.jdisk string 1226 16
Allison
UNIX> jdisk_test R tree-2.jdisk string 1426 16
Caleb
UNIX> jdisk_test R tree-2.jdisk string 1626 16
Daniel
UNIX>
Nice -- this is looking like the keys are all string-based (however, they are still 200 characters -- it just so happens that all of the characters after a string are byte 0x0).
Now, let's look at the LBA's, which will start 20 bytes from the end of the sector:
UNIX> jdisk_test R tree-2.jdisk hex 2028 20
04 00 00 00 0b 00 00 00 0a 00 00 00 02 00 00 00
06 00 00 00
UNIX>
These are the vals:
Alexis' val is the sector at LBA 4.
Allison's val is the sector at LBA 11.
Caleb's val is the sector at LBA 10.
Daniel's val is the sector at LBA 2.
And Eli's val is the sector at LBA 6. Remember how vals work for internal nodes!
If you examine these LBA's, you'll see that they are also strings, where all of the bytes after the strings are zero. Here, I'll prove it for LBA 4, and then we'll look at the strings in those five sectors:
UNIX> jdisk_test R tree-2.jdisk hex 4096 32
47 79 72 6f 73 63 6f 70 65 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX> echo 1024 16 / p | dc
64
UNIX> jdisk_test R tree-2.jdisk hex 4096 1024 | tail -n 63 | sort -u
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 4096 16
Gyroscope
UNIX> echo 11 1024 '*' p | dc
11264
UNIX> jdisk_test R tree-2.jdisk string 11264 16
Embellish
UNIX> jdisk_test R tree-2.jdisk string 10240 16
Sudanese
UNIX> jdisk_test R tree-2.jdisk string 2048 16
Shadowy
UNIX> echo 6 1024 '*' p | dc
6144
UNIX> jdisk_test R tree-2.jdisk string 6144 16
Eider
UNIX>
So, now we know:
?Key: Alexis - Val: Gyroscope
?Key: Allison - Val: Embellish
?Key: Caleb - Val: Sudanese
?Key: Daniel - Val: Shadowy
?Key: Eli - Val: Eider
Let's look at LBA 7 to see the keys greater than "Eli":
UNIX> echo 7 1024 '*' p | dc
7168
UNIX> jdisk_test R tree-2.jdisk hex 7168 16
00 03 47 72 61 63 65 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 7170 16
Grace
UNIX> jdisk_test R tree-2.jdisk string 7370 16
James
UNIX> jdisk_test R tree-2.jdisk string 7570 16
Landon
UNIX> echo 8 1024 '*' 20 - p | dc
8172
UNIX> jdisk_test R tree-2.jdisk hex 8172 20
05 00 00 00 09 00 00 00 03 00 00 00 00 00 00 00
00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 5120 16
Procter
UNIX> jdisk_test R tree-2.jdisk string `echo 9 1024 '*' p | dc` 16
Chug
UNIX> jdisk_test R tree-2.jdisk string `echo 3 1024 '*' p | dc` 16
Delinquent
UNIX>
So, armed with that information, we can draw our B-Tree as follows:
I have starred that last LBA, because it is unused.
The procedures in b_tree.h
Here is a description of the procedures that you have to write.
oid *b_tree_create(char *filename, long size, int key_size);?This creates an empty btree with the given file size, key_size and filename. The empty btree will have a root node which is external and has zero keys. It returns a handle to the btree in a void *.
void *b_tree_attach(char *filename);?This opens the given btree file, which should have been created previously with?b_tree_create(). Again, it returns a handle to the btree.
unsigned int b_tree_insert(void *b_tree, void *key, void *record);In this procedure,key is a pointer to? key_size bytes, and record is a pointer to JDISK_SECTOR_SIZE bytes. If the key is in the btree, then the procedure replaces the val with?record, and returns the LBA of the val. If the key is not in the btree, then it is inserted, and the val for that key is set to?record. In either case, the LBA of the val is returned. It will return 0 if our file is out of room. When this returns, the btree file is in the proper shape (in other words,jdisk_write()?calls need to be made for all of the sectors that have been added or changed).
I want to stress here that even though our examples above used null-terminated strings as keys, our btrees can take?any?keys that are?key_size?bytes. Use?memcmp()for key comparison. (And use?memcpy()?to copy keys and vals to their respective homes if need be).
unsigned int b_tree_find(void *b_tree, void *key);?This finds the given key, and returns the LBA of the val. If the key is not in the tree, this returns 0.
void *b_tree_disk(void *b_tree);This returns the jdisk pointer for the btree.
int key_size(void *b_tree);This returns the key size.
void *b_tree_print_tree(void *b_tree);This prints the tree -- see my examples for format. I'm not going to grade you on this. This can be a very useful procedure for debugging.
Useful program #1: b_tree_test
The program?b_tree_test.c?is a nice program to help you debug. It is called as follows:
b_tree_test file [CREATE file_size key_size]
If you call it with "CREATE", then it creates a btree file with the given size and key size. If you don't call it with "CREATE", then it attaches to the preexisting btree file.
Once the file is created, it accepts three commands:
"I key val" --?Key?must be a string less than or equal to the key size, and?val?must be a string less than or equal to JDISK_SECTOR_SIZE. The key is padded to be the key size with zeros, as is the val, to JDISK_SECTOR_SIZE. Then,?b_tree_insert()?is called with this key and value. It prints the LBA of the inserted val.
"F key" -- Finds the key and returns the LBA.
"P" calls?b_tree_print_tree(). If the key strings equal the key_size, this will do some ugly stuff.
So, let's create the tree above:
UNIX> rm tree-2.jdisk
UNIX> b_tree_test tree-2.jdisk CREATE 20480 200
I Daniel Shadowy
Insert return value: 2
I Landon Delinquent
Insert return value: 3
I Alexis Gyroscope
Insert return value: 4
I Grace Procter
Insert return value: 5
P
LBA 0x00000001. Internal: 0
Entry 0: LBA: 0x00000004. Key: S Alexis
Entry 1: LBA: 0x00000002. Key: S Daniel
Entry 2: LBA: 0x00000005. Key: S Grace
Entry 3: LBA: 0x00000003. Key: S Landon
Entry 4: LBA: 0x00000000
I Eli Eider
Insert return value: 6
P
LBA 0x00000008. Internal: 1
Entry 0: LBA: 0x00000001. Key: S Eli
Entry 1: LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: LBA: 0x00000004. Key: S Alexis
Entry 1: LBA: 0x00000002. Key: S Daniel
Entry 2: LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: LBA: 0x00000005. Key: S Grace
Entry 1: LBA: 0x00000003. Key: S Landon
Entry 2: LBA: 0x00000000
I James Chug
Insert return value: 9
I Caleb Sudanese
Insert return value: 10
I Allison Embellish
Insert return value: 11
P
LBA 0x00000008. Internal: 1
Entry 0: LBA: 0x00000001. Key: S Eli
Entry 1: LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: LBA: 0x00000004. Key: S Alexis
Entry 1: LBA: 0x0000000b. Key: S Allison
Entry 2: LBA: 0x0000000a. Key: S Caleb
Entry 3: LBA: 0x00000002. Key: S Daniel
Entry 4: LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: LBA: 0x00000005. Key: S Grace
Entry 1: LBA: 0x00000009. Key: S James
Entry 2: LBA: 0x00000003. Key: S Landon
Entry 3: LBA: 0x00000000
<CNTL-D>
Reads: 15
Writes: 28
UNIX>
Useful program #2: random_tester_1
This is a pretty heavyweight testing program:
random_tester_1 seed nevents key_size val_size tree_file input_file output_file
Here are the parameters:
seed is a seed for srand48()?(It's a long).
nevents is the number of events that you want to generate.
key_size is the size of the key.
val_size is the number of characters that you want in the vals.
tree_file?is the file name. If it doesn't exist, then it will be created with?key_size, and a size of 2*nevents*JDISK_SECTOR_SIZE.
input_file?is an optional file that tells you what is already in the tree. Use "-" to omit.
output_file?is an optional file that stores what's in the tree at the end of the program. You use this file as input for subsequent calls.
So,random_tester_1?randomly generates?b_tree_insert()?and?b_tree_find()?calls with a 50/50 probability. It stores its results internally, and double-checks the LBA's and vals of?b_tree_find()?calls. It generates keys which are random strings whose sizes are less than?key_size. After the strings are null characters. The vals are random strings whose sizes are less than or equal?val_size. All of the characters after the string (up to JDISK_SECTOR_SIZE) are null characters. Thus, these guys aren't too disgusting when you print them.
At the end, it prints the number of reads and writes to the jdisk. Here are some examples:
UNIX> random_tester_1 0 100 50 30 tree-3.jdisk - tree-3.txt > tmp-output.txt
Take a look at the program's output, in?tmp-output.txt. Here are the first few lines:
UNIX> head -n 5 tmp-output.txt
I mjeglqyuapnuiutrwhuvvjmvgglhhapmuclvaynkuh ajujjsdaaeuuuzhqroq
I xcedrqfnxavvqfguowkwpx jlkrkbpjgahf
I lgltigobrvwkgathopicmh sxstblhiqfyowhyvefbweptotgp
I bkcz ybsmyalyricsxgmptjelqds
F xcedrqfnxavvqfguowkwpx
UNIX>
It inserted four keys/vals, and then it performed a find on "xcedrqfnxavvqfguowkwpx". After that find, it double-checked that the LBA matched the original insert, and that the bytes are "jlkrkbpjgahf". We can double-check that with?b_tree_test?and?jdisk_test:
UNIX> echo F xcedrqfnxavvqfguowkwpx | b_tree_test tree-3.jdisk
Attached to tree-3.jdisk. FS: 10240000 - KS: 50
Find return value: 3
Reads: 3
Writes: 0
UNIX> jdisk_test R tree-3.jdisk string `echo 3 1024 '*' p | dc` 1024
jlkrkbpjgahf
UNIX>
The file?tree-3.txt?has the output from?random_tester_1. I can use it to make another call to?random_tester_1?that attaches to the tree that it just created. The input file tells me what's in the tree (keys, vals and LBA's). Now it generates new events (using the old keys and the new keys for finding):
UNIX> random_tester_1 1 10 50 30 tree-3.jdisk tree-3.txt -
I ndbqtrkuwgsgmilthoqwkhsvhxerjetjbcxzakw ctvztvyrhnbig
I hwe delscbm
F auaerduagfi
F dmkaaoptzhmyszbbybrfzfqyyfshbq
F yh
I murekjom mxv
I epunbavuprsytivhufivhxhpt j
F jblrqvbhhtssacoxqeksrxosrhnhpeiqmxjjaxrfiue
F riosrojnkzqcyqgkasvfwybfiqgpvfzacsfbcodp
I wdxszkdgjxqyakiqlweuah lt
Reads: 26
Writes: 26
UNIX>
You'll note, it called?b_tree_find()?on "yh", which was in the old tree, and made sure that the LBA and val for that key is what it was when we inserted it on the first call to?random_tester_1.
random_tester_1?will be a good way for you to test that your btree programs and mine are interoperable. For example, you can call it once with my version of?random_tester_1, and then again on your version with the input generated from the first call. They should both work together!
Life without a net:?random_tester_2
The parameters to?random_tester_2?are similar to?random_tester_1:
random_tester_2 seed nevents key_size tree_file input_file output_file
The only parameter that is missing is?val_size.?random_tester_2?now generates random keys that are exactly?key_size?bytes, and they can be?any?bytes. No longer are they strings. It also generates random vals that are exactly JDISK_SECTOR_SIZE bytes. This is the ultimate test for your programs, because you can't debug with nice strings. You'll have to debug with a little moxy.
The input and output files are now binary.
b_tree_dcs
I have a program called?b_tree_dcs, which stands for "B-Tree Double-Checker and Serializer." It takes a B-Tree file as its command line argument, and the first thing that it does is double-check that it is a valid B-Tree. You may find that useful for debugging (I did). After that, it prints all the keys out in sorted order, and then all the vals in the same order as the keys. If a key or val is a string followed by all null characters to fill out the buffer, then it prints the key or val with "S" and a string. Otherwise, it prints "H" and all of the bytes in hex. In that way, you can actually look at the trees of?random_tester_2?without resorting completely to?jdisk_test?or VI:
UNIX> b_tree_dcs tree-2.jdisk
Key_Size: 200
Key 0: S Alexis
Key 1: S Allison
Key 2: S Caleb
Key 3: S Daniel
Key 4: S Eli
Key 5: S Grace
Key 6: S James
Key 7: S Landon
Val 0: S Gyroscope
Val 1: S Embellish
Val 2: S Sudanese
Val 3: S Shadowy
Val 4: S Eider
Val 5: S Procter
Val 6: S Chug
Val 7: S Delinquent
UNIX> rm -f tmp.jdisk; random_tester_2 0 2 25 tmp.jdisk - - > /dev/null
UNIX> b_tree_dcs tmp.jdisk
Key_Size: 25
Key 0: H A95E509709556B8137127CE6160C538C53E4488F4B083ADA9A
Key 1: H C00A91D0F269AEBAB4483B0B5CEA668381C81552B9297D3A7D
Val 0: H 7824B141A5586A73755C3CFC645E2F4FC454332A25B44DE8CFCDEC636CC ...
Val 1: H 5AD463298B1EBB204624070D8C848B448EE15246053A1B632E59E07A666 ...
UNIX>
The Grading Script, in grader.sh
You asked for a grading script, so I have provided. In?grader.sh, you specify two command line arguments: A number and Y|N. The second argument says whether the script will delete its files or not. The first number can be any number. The script will create two jdisk files:?test-answer.jdisk?and?test-$USER.jdisk. The first one is created by the programs in the lab directory. The second one is created using the programs in the current directory. When the jdisk files are created, they are compared with?b_tree_dcs, and if the outputs match, the problem is correct.
For example, gradescript number 1 uses?b_tree_test?to insert one key into the B-tree file:
UNIX> sh /home/plank/cs494/labs/Lab-2-B-Tree-Btree/grader.sh 1 N
1
Problem 1 Correct.
tree-answer.jdisk and tree-plank.jdisk created as follows:
----------------------------------------------------------
rm -f tree-plank.jdisk tree-answer.jdisk
/home/plank/cs494/labs/Lab-2-B-Tree-Btree/b_tree_test tree-answer.jdisk CREATE 51200 26 < input.txt > /dev/null
./b_tree_test tree-plank.jdisk CREATE 51200 26 < input.txt > /dev/null
UNIX> cat input.txt
I Mackenzie Malignant
UNIX> b_tree_dcs tree-answer.jdisk
Key_Size: 26
Key 0: S Mackenzie
Val 0: S Malignant
UNIX> b_tree_dcs tree-plank.jdisk
Key_Size: 26
Key 0: S Mackenzie
Val 0: S Malignant
UNIX>
Gradescript number 2 is a little more complex -- it creates an input file with 161 random insertions (some of which will replace keys). It calls the?b_disk_test?program in the lab directory to create?tree-answer.jdisk, from the first 137 entries of the input file. Then it copies?tree-answer.jdisk?to?tree-$USER.jdisk, and processes the remaining 24 insertions using the lab program and your program:
UNIX> sh /home/plank/cs494/labs/Lab-2-B-Tree-Btree/grader.sh 2 N
2
Problem 2 Correct.
tree-answer.jdisk and tree-plank.jdisk created as follows:
----------------------------------------------------------
rm -f tree-plank.jdisk tree-answer.jdisk
head -n 137 input.txt | /home/plank/cs494/labs/Lab-2-B-Tree-Btree/b_tree_test tree-answer.jdisk CREATE 614400 129 > /dev/null
cp tree-answer.jdisk tree-plank.jdisk
sed 1,137d input.txt | /home/plank/cs494/labs/Lab-2-B-Tree-Btree/b_tree_test tree-answer.jdisk > /dev/null
sed 1,137d input.txt | ./b_tree_test tree-plank.jdisk > /dev/null
UNIX> wc input.txt
161 483 2807 input.txt
UNIX> head input.txt
I Blake Shelter
I Mackenzie Effluvium
I Brianna Reluctant
I Evelyn Vigilantism
I Mason Drawn
I Anthony Singlet
I Anna Microjoule
I Kate Howell
I Paige Tinder
I Xavier Roulette
UNIX> tail input.txt
I Peyton Tammany
I Caleb Abe
I Landon Knick
I Makayla Fortescue
I Brody Repeat
I Nathan Swizzle
I Madison Nonetheless
I Ava Astronautic
I Zoey Indebted
I Chloe Smirk
UNIX> b_tree_dcs tree-answer.jdisk | head
Key_Size: 129
Key 0: S Aaron
Key 1: S Abigail
Key 2: S Addison
Key 3: S Aidan
Key 4: S Aiden
Key 5: S Alex
Key 6: S Alexander
Key 7: S Alexis
Key 8: S Allison
UNIX>
The tests performed by the gradescript are based on the number given mod ten. The tests are as follows:
?Number mod 10 = 1: 1 to 15 random insertions using?b_tree_test. Key size is between 15 and 50.
Number mod 10 = 2: 1 to 150 random insertions using the lab directory's?b_tree_test. Key size is between 15 and 214. Then 1 to 150 more random insertions using?b_tree_test.
Number mod 10 = 3: Calling?random_tester_1?with the gradescript number as the seed. 1 to 20 events; Key size between 15 and 214; Val size between 1 and 200.
Number mod 10 = 4: Calling?random_tester_1?with the gradescript number as the seed. 1 to 500 events; Key size between 15 and 252; Val size between 1 and 1023.
Number mod 10 = 5: Calling?random_tester_2?with the gradescript number as the seed. 1 to 500 events; Key size between 15 and 252.
Number mod 10 = 6: Calling?random_tester_2?with the gradescript number as the seed. 1 to 5000 events; Key size between 4 and 10.
Number mod 10 = 7: Calling?random_tester_2?with the gradescript number as the seed. 1 to 5000 events; Key size = 254.
Number mod 10 = 8: Calling?b_tree_test?to create a b-tree of appropriate size. Calling?random_tester_2?in the lab directory to with up to 5000 events, key size = 254. The resulting tree is then copied to?tree-$USER.jdisk, and your?random_tester_2?is called for another 1 to 5000 events.
Number mod 10 = 9: Same as test 8, only now the key size is between 4 and 10.
Number mod 10 = 0: Same as test 9, only now the key size is between 4 and 254.
Each of these can take a few seconds, because the files can be in the 10's of megabytes.
Some help
You don't need this section, but you probably should read it. My internal btree struct was as follows:
typedef struct {
int key_size; /* These are the first 16/12 bytes in sector 0 */
unsigned int root_lba;
unsigned long first_free_block;
void *disk; /* The jdisk */
unsigned long size; /* The jdisk's size */
unsigned long num_lbas; /* size/JDISK_SECTOR_SIZE */
int keys_per_block; /* MAXKEY */
int lbas_per_block; /* MAXKEY+1 */
Tree_Node *free_list; /* Free list of nodes */
Tree_Node *tmp_e; /* When find() fails, this is a pointer to the external node */
int tmp_e_index; /* and the index where the key should have gone */
int flush; /* Should I flush sector[0] to disk after b_tree_insert() */
} B_Tree;
Note how the first three fields are such that you can write them to sector 0 (and read them from sector 0). As it turns out, I write the whole struct to sector 0, but the remaining bytes are ignored.
A?Tree_Node?is the internal representation of a node. Here's its struct:
typedef struct tnode {
unsigned char bytes[JDISK_SECTOR_SIZE+256]; /* This holds the sector for reading and writing.
It has extra room because your internal representation
will hold an extra key. */
unsigned char nkeys; /* Number of keys in the node */
unsigned char flush; /* Should I flush this to disk at the end of b_tree_insert()? */
unsigned char internal; /* Internal or external node */
unsigned int lba; /* LBA when the node is flushed */
unsigned char **keys; /* Pointers to the keys. Size = MAXKEY+1 */
unsigned int *lbas; /* Pointer to the array of LBA's. Size = MAXKEY+2 */
struct tnode *parent; /* Pointer to my parent -- useful for splitting */
int parent_index; /* My index in my parent */
struct tnode *ptr; /* Free list link */
} Tree_Node;
I maintain my own free list of?Tree_Node?structs. When I allocate one, I do three?malloc()?calls. One is for the?Tree_Node?itself, one is for?keys?and one is for?lbas. I set the values of?keys?right after I allocate them, since they point to fixed locations in?bytes. For example,?keys[0]?will point to?bytes+2.?keys[1]?will point to?bytes+2+key_size. Etc. The size of?keys?is?MAXKEY+1. Moreover, the size of?lbas?is?MAXKEY+2. The reason is that in my internal representation of a B-Tree node, I am allowed to store an extra key and val. This simplifies the implementation of splitting in an?enormous?way. If you don't believe me, ask anyone in the 2015 version of CS494 who tried to implement B-Trees without it. When I'm done with a?Tree_Node, I put it onto my free list. That way, if I need a new?Tree_Node?later, and there's one on the free list, I don't have to do any?malloc()?calls -- I simply grab it from the free list.?Keys?and?lbas?are already allocated, and the values of?keys?are already set correctly. That's convenient.
I have a procedure which reads a?Tree_Node?from a jdisk. It reads it into?bytes, and then it copies the LBA's from the end of the sector into?lbas?(using?memcpy). It has to do this, because it uses the end of the sector for that extra key. It also reads?nkeys?and?internal?from the sector. A convenient thing is that the?keys?pointers are already pointing to the correct place, so you don't need to do anything special with the keys.
I set the?flush?field whenever I modify a B-Tree node, and the modified node needs to be flushed to disk. I do the final flushing at the end of?b_tree_insert(). When I flush a node, I need to create the sector. I do this by copying?nkeys?and?internal?into their proper place in?bytes. I then copy?lbas?to their proper place in?bytes. The keys are already in their proper place. I then write?bytes?using?jdisk_write().
What I did for splitting was as follows -- I'd go ahead and insert the key/lba. Since there's room for an extra key and lba in the?Tree_Node, this works even when the node is full. I then checked to see if the node needed to be split, and if so, I called a procedure called?split_node. This procedure has to be recursive, BTW.
When I call?find(), I have it read in all of the nodes on the path to the external node, setting their parent fields, but setting?flush?to 0. The?flush?field is set by?b_tree_insert()?when the node changes. The parent fields are used on splitting. When I'm done with?b_tree_insert()?or?b_tree_find(), I call?free_and_flush()?on every node from the external node up to the root. This flushes the node if?flush?is set, and puts the node onto the free list. I found?free_and_flush?super-helpful.
b_tree_instrument.o: Instrumented code.
In?b_tree_instrument.o, I have added code that prints my?Tree_Node?just after reading, and just before flushing. I have compiled it with?b_tree_test?to make the executable?b_tree_test_inst. This may help you figure out how my pointers work. Here's an example, where I insert a new node into?tree-2.jdisk. First, here's what I'm doing:
UNIX> cp tree-2.jdisk tmp.jdisk
UNIX> echo P | b_tree_test tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
LBA 0x00000008. Internal: 1
Entry 0: Key: Eli . LBA: 0x00000001
Entry 1: . LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: Key: Alexis . LBA: 0x00000004
Entry 1: Key: Allison . LBA: 0x0000000b
Entry 2: Key: Caleb . LBA: 0x0000000a
Entry 3: Key: Daniel . LBA: 0x00000002
Entry 4: . LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: Key: Grace . LBA: 0x00000005
Entry 1: Key: James . LBA: 0x00000009
Entry 2: Key: Landon . LBA: 0x00000003
Entry 3: . LBA: 0x00000000
Reads: 4
Writes: 0
UNIX> echo I A-Aron Stoae | b_tree_test tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
Insert return value: 12
Reads: 3
Writes: 5
UNIX> echo P | b_tree_test tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
LBA 0x00000008. Internal: 1
Entry 0: Key: Allison . LBA: 0x00000001
Entry 1: Key: Eli . LBA: 0x0000000d
Entry 2: . LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: Key: A-Aron . LBA: 0x0000000c
Entry 1: Key: Alexis . LBA: 0x00000004
Entry 2: . LBA: 0x0000000b
LBA 0x0000000d. Internal: 0
Entry 0: Key: Caleb . LBA: 0x0000000a
Entry 1: Key: Daniel . LBA: 0x00000002
Entry 2: . LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: Key: Grace . LBA: 0x00000005
Entry 1: Key: James . LBA: 0x00000009
Entry 2: Key: Landon . LBA: 0x00000003
Entry 3: . LBA: 0x00000000
Reads: 5
Writes: 0
UNIX>
Now, here are the tree nodes as they are read in the beginning of the insertion, and as they are flushed after the insertion:
UNIX> cp tree-2.jdisk tmp.jdisk
UNIX> echo I A-Aron Stoae | b_tree_test_inst tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
Find(): Read -- Tree Node 0x1e840c0
Bytes: 0x1e840c0
Nkeys Address: 0x1e845c0
Nkeys Value: 1
Flush Value: 0
Internal Value: 1
LBA Value: 8
Keys: 0x1e84600 keys is allocated with malloc().
Keys[0] 0x1e840c2 (Eli) This is bytes+2
Keys[1] 0x1e8418a (Landon) This is bytes+2+key_size. Since nkeys equals 1, the contents here are
meaningless. They happen to hold "Landon" from a previous time,
when "Landon" was the second key. You have to ignore this, because
Nkeys is one.
Keys[2] 0x1e84252 () This is bytes+2+key_size*2
Keys[3] 0x1e8431a ()
Keys[4] 0x1e843e2 () This is the extra key.
Lbas: 0x1e84630 lbas is allocated with malloc().
Lbas[0] 0x1 Lba of the internal node before "Eli".
Lbas[1] 0x7 Lba of the internal node after "Eli".
Lbas[2] 0x0 The value of these is meaningless. They happen to be zero.
Lbas[3] 0x0
Lbas[4] 0x0
Lbas[5] 0x0 This is the extra lba
Parent 0x0
Parent_index -1
&ptr 0x1e845e8 This is meaningless, because the tree_node isn't on the free list.
Find(): Read -- Tree Node 0x1e84650
Bytes: 0x1e84650
Nkeys Address: 0x1e84b50
Nkeys Value: 4
Flush Value: 0
Internal Value: 0
LBA Value: 1
Keys: 0x1e84b90
Keys[0] 0x1e84652 (Alexis)
Keys[1] 0x1e8471a (Allison)
Keys[2] 0x1e847e2 (Caleb)
Keys[3] 0x1e848aa (Daniel)
Keys[4] 0x1e84972 ()
Lbas: 0x1e84bc0
Lbas[0] 0x4
Lbas[1] 0xb
Lbas[2] 0xa
Lbas[3] 0x2
Lbas[4] 0x6
Lbas[5] 0x0
Parent 0x1e840c0
Parent_index 0
&ptr 0x1e84b78
Free_and_flush(): Write -- Tree Node 0x1e84be0
Bytes: 0x1e84be0
Nkeys Address: 0x1e850e0
Nkeys Value: 2
Flush Value: 1
Internal Value: 0
LBA Value: 13
Keys: 0x1e85120
Keys[0] 0x1e84be2 (Caleb)
Keys[1] 0x1e84caa (Daniel)
Keys[2] 0x1e84d72 ()
Keys[3] 0x1e84e3a ()
Keys[4] 0x1e84f02 ()
Lbas: 0x1e85150
Lbas[0] 0xa
Lbas[1] 0x2
Lbas[2] 0x6
Lbas[3] 0x0
Lbas[4] 0x0
Lbas[5] 0x0
Parent 0x0
Parent_index 0
&ptr 0x1e85108
Free_and_flush(): Write -- Tree Node 0x1e84650
Bytes: 0x1e84650
Nkeys Address: 0x1e84b50
Nkeys Value: 2
Flush Value: 1
Internal Value: 0
LBA Value: 1
Keys: 0x1e84b90
Keys[0] 0x1e84652 (A-Aron)
Keys[1] 0x1e8471a (Alexis)
Keys[2] 0x1e847e2 (Allison) These are leftover from before the split. nkeys is 2,
Keys[3] 0x1e848aa (Caleb) so they are ignored, but they are written to disk.
Keys[4] 0x1e84972 (Daniel) As you can see there is room for 5 keys, which makes splitting easier.
Lbas: 0x1e84bc0
Lbas[0] 0xc
Lbas[1] 0x4
Lbas[2] 0xb
Lbas[3] 0xa Same with these LBA's.
Lbas[4] 0x2
Lbas[5] 0x6
Parent 0x1e840c0
Parent_index 0
&ptr 0x1e84b78
Free_and_flush(): Write -- Tree Node 0x1e840c0
Bytes: 0x1e840c0
Nkeys Address: 0x1e845c0
Nkeys Value: 2
Flush Value: 1
Internal Value: 1
LBA Value: 8
Keys: 0x1e84600
Keys[0] 0x1e840c2 (Allison)
Keys[1] 0x1e8418a (Eli)
Keys[2] 0x1e84252 ()
Keys[3] 0x1e8431a ()
Keys[4] 0x1e843e2 ()
Lbas: 0x1e84630
Lbas[0] 0x1
Lbas[1] 0xd
Lbas[2] 0x7
Lbas[3] 0x0
Lbas[4] 0x0
Lbas[5] 0x0
Parent 0x0
Parent_index -1
&ptr 0x1e845e8
Insert return value: 12
Reads: 3
Writes: 5
UNIX>
Please keep this in mind when you are working on the lab.?For grading, you need to compile?random_tester_1.cpp?and?random_tester_2.cpp?with your implementation of B-trees.Double-check youself and make sure that you are doing it.?In multiple semesters, students have copied my?random_tester_1?and?random_tester_2?rather than compile their own. Of course, mine pass the gradescripts. You want to make sure that?yours?are passing the gradescripts.
I don't have B-Tree lecture notes that I have written. Instead, I rely on?this web page, by David Carlson and Isidore Minerd, which I think is very well done.
You are going to implement the "B+ Tree" variant, where internal nodes only hold keys and pointers to other nodes, and external nodes hold keys and pointers to values. One change from their description is that the value of a key in an internal node is going to be held in the largest val pointer in the key's predecessor subtree. Let me show an example, lifted from their notes:
In their example, the val pointer for J is the pointer to the left of K. In our trees, the val pointer for J will be the pointer to the right of I. Similarly:
The val pointer for F is the pointer to the right of E.
The val pointer for C is the pointer to the right of B.
The val pointer for M is the pointer to the right of L.
The val pointer for R is the pointer to the right of P.
The val pointer for U is the pointer to the right of T.
The val pointer to the right of Z is unused.
(As a corollary to this, when you delete an internal node, you swap it with the previous node in the tree, not the subsequent one. Since we are not doing deletion, that doesn't matter.....).
Jdisk: A disk emulator
We are using the same disk emulator,?jdisk?as in?The FAT lab. Please see that lab writeup for information on?jdisk.c/jdisk.h/jdisk_tester. Please also see that lab for information on how to view and modify binary files in VI.
What you have to write: b_tree.c
Your job is to write?b_tree.c. Its job is to implement the interface in?b_tree.h:
#ifndef _BP_TREE_
#define _BP_TREE_
#include "jdisk.h"
void *b_tree_create(char *filename, long size, int key_size);
void *b_tree_attach(char *filename);
unsigned int b_tree_insert(void *b_tree, void *key, void *record);
unsigned int b_tree_find(void *b_tree, void *key);
void *b_tree_disk(void *b_tree);
int b_tree_key_size(void *b_tree);
void b_tree_print_tree(void *b_tree);
#endif
What you are going to do is implement B-trees on top of jdisks. The keys will be buffers of exactly?key_size?bytes, which is defined when you create a btree. The vals will be buffers of exactly JDISK_SECTOR_SIZE bytes. Each node of the tree will fit into a sector of the disk.
The jdisks?must?have a specific format. That means that the jdisks that your programs create must be readable by my btree programs and vice versa (so long as they have the same sizes for longs and the same endian-ness). They don't have to be identical to mine. They just have to work. Here is the format:
Sector 0
Sector zero can have anything in it, so long as the first 16 (or 12) bytes contain the following:
?Bytes 0-3: The key size as an integer.
?Bytes 4-7: The LBA of the sector that holds the root node of the B-tree.
?Bytes 8-15 (or 8-11 if longs are 4 bytes): The first free LBA on the jdisk. You are going to allocate sectors consecutively from sector 0, and since you never deallocate a sector (our B-Trees don't allow deletion), you can keep track of the free sectors with a single number.
?Yes, the remaining 1008 bytes are wasted. You can use them, but since your program has to be interoperable with mine and others, their values will be ignored.
Let's stop there. Take a look at a file that is a jdisk for a B-Tree:
UNIX> ls -l tree-1.jdisk
-rw-r--r-- 1 plank loci 2048000 Feb 13 15:57 tree-1.jdisk
UNIX> jdisk_test R tree-1.jdisk hex 0 16
17 00 00 00 29 00 00 00 f1 01 00 00 00 00 00 00
UNIX>
The file is roughly 2MB, composed of 2,000 sectors. When we look at the first 16 bytes, we see the numbers in little endian format (which is the format of our current Intel processors). They key size is 0x17, or 23 bytes. The LBA of the root node is 0x29, and the first free sector is LBA 0x1f1 = 497.
Nodes
Now, the format of a sector holding a node of the tree is as follows. The first byte is a zero or one, specifying whether the node is external (0) or internal (1). The next byte says how many keys are in the node. How many keys can you fit into a node? The answer is (JDISK_SECTOR_SIZE - 6) / (key_size + 4). You'll see why in a minute. Let's call that value MAXKEY. Keys must be between 4 and 254 bytes (inclusive). Therefore, even if keys are four bytes, you can fit the number of keys into an unsigned char.
The next MAXKEY * key_size bytes are the keys. Then there can be some wasted bytes. The last (MAXKEY+1)*4 bytes are the LBA's, which are the pointers of the B-Trees. If the node is internal, then they are the LBA's of nodes that are pointed to by the node. If the node is external, then they are the LBA's of vals. If there are?nkeys?keys in the node, then there are?nkeys+1?LBA's.
Data (the vals)
Data is stored in a sector. The data *must* be JDISK_SECTOR_SIZE bytes.
A detailed example
Let's explore this a little. Let's try?tree-2.jdisk:
UNIX> ls -l tree-2.jdisk
-rw-r--r-- 1 plank staff 20480 Feb 11 22:58 tree-2.jdisk
UNIX> jdisk_test R tree-2.jdisk hex 0 16
c8 00 00 00 08 00 00 00 0c 00 00 00 00 00 00 00
UNIX>
This is a file with 20 sectors. The key size is 200 (0xc8), of which 12 (0x0c) are currently in use. The LBA of the root node is 8. Let's take a look at the first few bytes of that node:
UNIX> echo '8 1024 * p' | dc
8192
UNIX> jdisk_test R tree-2.jdisk hex 8192 32
01 01 45 6c 69 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX>
So, this is an internal node, because the first byte is one. It holds one key, because the second byte is also one. The first key is in the next 200 bytes -- if you examine them, they are all zeros except for the first three. I know they hold a string (because I created this file), so let's look at them:
UNIX> jdisk_test R tree-2.jdisk string 8194 3
Eli
UNIX> jdisk_test R tree-2.jdisk string 8194 200
Eli
UNIX>
Just a note here about keys that are strings -- since they are null terminated,?jdisk_test?will print the string, regardless of whether you specify 3 characters or 200. In the example above,?jdisk_test?is reading 200 characters, but since the fourth character is '\0', it only prints out "Eli".
So, there is one key, which means that there are two pointers out of the node. Each of these pointers is an LBA of the sector holding the node to which the pointer points. How do we find these LBA's? Well, first, let's figure out what MAXKEY is: (1024 - 6) / (200+4) = 4.99. That means MAXKEY is four (and there is quite a bit of wasted space in our nodes. So it goes). Since a node can hold four keys, it can hold 5 LBA pointers. Those are in the last 5*4 bytes of the sector. Let's look at them:
UNIX> jdisk_test R tree-2.jdisk hex 9196 20
01 00 00 00 07 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
UNIX>
So, the first pointer is to block 1, and then second is to block 7. Let's look at block 1:
UNIX> jdisk_test R tree-2.jdisk hex 1024 32
00 04 41 6c 65 78 69 73 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX>
It is an external node that holds four keys. The keys will start at bytes 1026, 1226, 1426 and 1626:
UNIX> jdisk_test R tree-2.jdisk hex 1024 32
00 04 41 6c 65 78 69 73 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1026 16
41 6c 65 78 69 73 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1226 16
41 6c 6c 69 73 6f 6e 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1426 16
43 61 6c 65 62 00 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk hex 1626 16
44 61 6e 69 65 6c 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 1026 16
Alexis
UNIX> jdisk_test R tree-2.jdisk string 1226 16
Allison
UNIX> jdisk_test R tree-2.jdisk string 1426 16
Caleb
UNIX> jdisk_test R tree-2.jdisk string 1626 16
Daniel
UNIX>
Nice -- this is looking like the keys are all string-based (however, they are still 200 characters -- it just so happens that all of the characters after a string are byte 0x0).
Now, let's look at the LBA's, which will start 20 bytes from the end of the sector:
UNIX> jdisk_test R tree-2.jdisk hex 2028 20
04 00 00 00 0b 00 00 00 0a 00 00 00 02 00 00 00
06 00 00 00
UNIX>
These are the vals:
Alexis' val is the sector at LBA 4.
Allison's val is the sector at LBA 11.
Caleb's val is the sector at LBA 10.
Daniel's val is the sector at LBA 2.
And Eli's val is the sector at LBA 6. Remember how vals work for internal nodes!
If you examine these LBA's, you'll see that they are also strings, where all of the bytes after the strings are zero. Here, I'll prove it for LBA 4, and then we'll look at the strings in those five sectors:
UNIX> jdisk_test R tree-2.jdisk hex 4096 32
47 79 72 6f 73 63 6f 70 65 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX> echo 1024 16 / p | dc
64
UNIX> jdisk_test R tree-2.jdisk hex 4096 1024 | tail -n 63 | sort -u
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 4096 16
Gyroscope
UNIX> echo 11 1024 '*' p | dc
11264
UNIX> jdisk_test R tree-2.jdisk string 11264 16
Embellish
UNIX> jdisk_test R tree-2.jdisk string 10240 16
Sudanese
UNIX> jdisk_test R tree-2.jdisk string 2048 16
Shadowy
UNIX> echo 6 1024 '*' p | dc
6144
UNIX> jdisk_test R tree-2.jdisk string 6144 16
Eider
UNIX>
So, now we know:
?Key: Alexis - Val: Gyroscope
?Key: Allison - Val: Embellish
?Key: Caleb - Val: Sudanese
?Key: Daniel - Val: Shadowy
?Key: Eli - Val: Eider
Let's look at LBA 7 to see the keys greater than "Eli":
UNIX> echo 7 1024 '*' p | dc
7168
UNIX> jdisk_test R tree-2.jdisk hex 7168 16
00 03 47 72 61 63 65 00 00 00 00 00 00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 7170 16
Grace
UNIX> jdisk_test R tree-2.jdisk string 7370 16
James
UNIX> jdisk_test R tree-2.jdisk string 7570 16
Landon
UNIX> echo 8 1024 '*' 20 - p | dc
8172
UNIX> jdisk_test R tree-2.jdisk hex 8172 20
05 00 00 00 09 00 00 00 03 00 00 00 00 00 00 00
00 00 00 00
UNIX> jdisk_test R tree-2.jdisk string 5120 16
Procter
UNIX> jdisk_test R tree-2.jdisk string `echo 9 1024 '*' p | dc` 16
Chug
UNIX> jdisk_test R tree-2.jdisk string `echo 3 1024 '*' p | dc` 16
Delinquent
UNIX>
So, armed with that information, we can draw our B-Tree as follows:
I have starred that last LBA, because it is unused.
The procedures in b_tree.h
Here is a description of the procedures that you have to write.
oid *b_tree_create(char *filename, long size, int key_size);?This creates an empty btree with the given file size, key_size and filename. The empty btree will have a root node which is external and has zero keys. It returns a handle to the btree in a void *.
void *b_tree_attach(char *filename);?This opens the given btree file, which should have been created previously with?b_tree_create(). Again, it returns a handle to the btree.
unsigned int b_tree_insert(void *b_tree, void *key, void *record);In this procedure,key is a pointer to? key_size bytes, and record is a pointer to JDISK_SECTOR_SIZE bytes. If the key is in the btree, then the procedure replaces the val with?record, and returns the LBA of the val. If the key is not in the btree, then it is inserted, and the val for that key is set to?record. In either case, the LBA of the val is returned. It will return 0 if our file is out of room. When this returns, the btree file is in the proper shape (in other words,jdisk_write()?calls need to be made for all of the sectors that have been added or changed).
I want to stress here that even though our examples above used null-terminated strings as keys, our btrees can take?any?keys that are?key_size?bytes. Use?memcmp()for key comparison. (And use?memcpy()?to copy keys and vals to their respective homes if need be).
unsigned int b_tree_find(void *b_tree, void *key);?This finds the given key, and returns the LBA of the val. If the key is not in the tree, this returns 0.
void *b_tree_disk(void *b_tree);This returns the jdisk pointer for the btree.
int key_size(void *b_tree);This returns the key size.
void *b_tree_print_tree(void *b_tree);This prints the tree -- see my examples for format. I'm not going to grade you on this. This can be a very useful procedure for debugging.
Useful program #1: b_tree_test
The program?b_tree_test.c?is a nice program to help you debug. It is called as follows:
b_tree_test file [CREATE file_size key_size]
If you call it with "CREATE", then it creates a btree file with the given size and key size. If you don't call it with "CREATE", then it attaches to the preexisting btree file.
Once the file is created, it accepts three commands:
"I key val" --?Key?must be a string less than or equal to the key size, and?val?must be a string less than or equal to JDISK_SECTOR_SIZE. The key is padded to be the key size with zeros, as is the val, to JDISK_SECTOR_SIZE. Then,?b_tree_insert()?is called with this key and value. It prints the LBA of the inserted val.
"F key" -- Finds the key and returns the LBA.
"P" calls?b_tree_print_tree(). If the key strings equal the key_size, this will do some ugly stuff.
So, let's create the tree above:
UNIX> rm tree-2.jdisk
UNIX> b_tree_test tree-2.jdisk CREATE 20480 200
I Daniel Shadowy
Insert return value: 2
I Landon Delinquent
Insert return value: 3
I Alexis Gyroscope
Insert return value: 4
I Grace Procter
Insert return value: 5
P
LBA 0x00000001. Internal: 0
Entry 0: LBA: 0x00000004. Key: S Alexis
Entry 1: LBA: 0x00000002. Key: S Daniel
Entry 2: LBA: 0x00000005. Key: S Grace
Entry 3: LBA: 0x00000003. Key: S Landon
Entry 4: LBA: 0x00000000
I Eli Eider
Insert return value: 6
P
LBA 0x00000008. Internal: 1
Entry 0: LBA: 0x00000001. Key: S Eli
Entry 1: LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: LBA: 0x00000004. Key: S Alexis
Entry 1: LBA: 0x00000002. Key: S Daniel
Entry 2: LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: LBA: 0x00000005. Key: S Grace
Entry 1: LBA: 0x00000003. Key: S Landon
Entry 2: LBA: 0x00000000
I James Chug
Insert return value: 9
I Caleb Sudanese
Insert return value: 10
I Allison Embellish
Insert return value: 11
P
LBA 0x00000008. Internal: 1
Entry 0: LBA: 0x00000001. Key: S Eli
Entry 1: LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: LBA: 0x00000004. Key: S Alexis
Entry 1: LBA: 0x0000000b. Key: S Allison
Entry 2: LBA: 0x0000000a. Key: S Caleb
Entry 3: LBA: 0x00000002. Key: S Daniel
Entry 4: LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: LBA: 0x00000005. Key: S Grace
Entry 1: LBA: 0x00000009. Key: S James
Entry 2: LBA: 0x00000003. Key: S Landon
Entry 3: LBA: 0x00000000
<CNTL-D>
Reads: 15
Writes: 28
UNIX>
Useful program #2: random_tester_1
This is a pretty heavyweight testing program:
random_tester_1 seed nevents key_size val_size tree_file input_file output_file
Here are the parameters:
seed is a seed for srand48()?(It's a long).
nevents is the number of events that you want to generate.
key_size is the size of the key.
val_size is the number of characters that you want in the vals.
tree_file?is the file name. If it doesn't exist, then it will be created with?key_size, and a size of 2*nevents*JDISK_SECTOR_SIZE.
input_file?is an optional file that tells you what is already in the tree. Use "-" to omit.
output_file?is an optional file that stores what's in the tree at the end of the program. You use this file as input for subsequent calls.
So,random_tester_1?randomly generates?b_tree_insert()?and?b_tree_find()?calls with a 50/50 probability. It stores its results internally, and double-checks the LBA's and vals of?b_tree_find()?calls. It generates keys which are random strings whose sizes are less than?key_size. After the strings are null characters. The vals are random strings whose sizes are less than or equal?val_size. All of the characters after the string (up to JDISK_SECTOR_SIZE) are null characters. Thus, these guys aren't too disgusting when you print them.
At the end, it prints the number of reads and writes to the jdisk. Here are some examples:
UNIX> random_tester_1 0 100 50 30 tree-3.jdisk - tree-3.txt > tmp-output.txt
Take a look at the program's output, in?tmp-output.txt. Here are the first few lines:
UNIX> head -n 5 tmp-output.txt
I mjeglqyuapnuiutrwhuvvjmvgglhhapmuclvaynkuh ajujjsdaaeuuuzhqroq
I xcedrqfnxavvqfguowkwpx jlkrkbpjgahf
I lgltigobrvwkgathopicmh sxstblhiqfyowhyvefbweptotgp
I bkcz ybsmyalyricsxgmptjelqds
F xcedrqfnxavvqfguowkwpx
UNIX>
It inserted four keys/vals, and then it performed a find on "xcedrqfnxavvqfguowkwpx". After that find, it double-checked that the LBA matched the original insert, and that the bytes are "jlkrkbpjgahf". We can double-check that with?b_tree_test?and?jdisk_test:
UNIX> echo F xcedrqfnxavvqfguowkwpx | b_tree_test tree-3.jdisk
Attached to tree-3.jdisk. FS: 10240000 - KS: 50
Find return value: 3
Reads: 3
Writes: 0
UNIX> jdisk_test R tree-3.jdisk string `echo 3 1024 '*' p | dc` 1024
jlkrkbpjgahf
UNIX>
The file?tree-3.txt?has the output from?random_tester_1. I can use it to make another call to?random_tester_1?that attaches to the tree that it just created. The input file tells me what's in the tree (keys, vals and LBA's). Now it generates new events (using the old keys and the new keys for finding):
UNIX> random_tester_1 1 10 50 30 tree-3.jdisk tree-3.txt -
I ndbqtrkuwgsgmilthoqwkhsvhxerjetjbcxzakw ctvztvyrhnbig
I hwe delscbm
F auaerduagfi
F dmkaaoptzhmyszbbybrfzfqyyfshbq
F yh
I murekjom mxv
I epunbavuprsytivhufivhxhpt j
F jblrqvbhhtssacoxqeksrxosrhnhpeiqmxjjaxrfiue
F riosrojnkzqcyqgkasvfwybfiqgpvfzacsfbcodp
I wdxszkdgjxqyakiqlweuah lt
Reads: 26
Writes: 26
UNIX>
You'll note, it called?b_tree_find()?on "yh", which was in the old tree, and made sure that the LBA and val for that key is what it was when we inserted it on the first call to?random_tester_1.
random_tester_1?will be a good way for you to test that your btree programs and mine are interoperable. For example, you can call it once with my version of?random_tester_1, and then again on your version with the input generated from the first call. They should both work together!
Life without a net:?random_tester_2
The parameters to?random_tester_2?are similar to?random_tester_1:
random_tester_2 seed nevents key_size tree_file input_file output_file
The only parameter that is missing is?val_size.?random_tester_2?now generates random keys that are exactly?key_size?bytes, and they can be?any?bytes. No longer are they strings. It also generates random vals that are exactly JDISK_SECTOR_SIZE bytes. This is the ultimate test for your programs, because you can't debug with nice strings. You'll have to debug with a little moxy.
The input and output files are now binary.
b_tree_dcs
I have a program called?b_tree_dcs, which stands for "B-Tree Double-Checker and Serializer." It takes a B-Tree file as its command line argument, and the first thing that it does is double-check that it is a valid B-Tree. You may find that useful for debugging (I did). After that, it prints all the keys out in sorted order, and then all the vals in the same order as the keys. If a key or val is a string followed by all null characters to fill out the buffer, then it prints the key or val with "S" and a string. Otherwise, it prints "H" and all of the bytes in hex. In that way, you can actually look at the trees of?random_tester_2?without resorting completely to?jdisk_test?or VI:
UNIX> b_tree_dcs tree-2.jdisk
Key_Size: 200
Key 0: S Alexis
Key 1: S Allison
Key 2: S Caleb
Key 3: S Daniel
Key 4: S Eli
Key 5: S Grace
Key 6: S James
Key 7: S Landon
Val 0: S Gyroscope
Val 1: S Embellish
Val 2: S Sudanese
Val 3: S Shadowy
Val 4: S Eider
Val 5: S Procter
Val 6: S Chug
Val 7: S Delinquent
UNIX> rm -f tmp.jdisk; random_tester_2 0 2 25 tmp.jdisk - - > /dev/null
UNIX> b_tree_dcs tmp.jdisk
Key_Size: 25
Key 0: H A95E509709556B8137127CE6160C538C53E4488F4B083ADA9A
Key 1: H C00A91D0F269AEBAB4483B0B5CEA668381C81552B9297D3A7D
Val 0: H 7824B141A5586A73755C3CFC645E2F4FC454332A25B44DE8CFCDEC636CC ...
Val 1: H 5AD463298B1EBB204624070D8C848B448EE15246053A1B632E59E07A666 ...
UNIX>
The Grading Script, in grader.sh
You asked for a grading script, so I have provided. In?grader.sh, you specify two command line arguments: A number and Y|N. The second argument says whether the script will delete its files or not. The first number can be any number. The script will create two jdisk files:?test-answer.jdisk?and?test-$USER.jdisk. The first one is created by the programs in the lab directory. The second one is created using the programs in the current directory. When the jdisk files are created, they are compared with?b_tree_dcs, and if the outputs match, the problem is correct.
For example, gradescript number 1 uses?b_tree_test?to insert one key into the B-tree file:
UNIX> sh /home/plank/cs494/labs/Lab-2-B-Tree-Btree/grader.sh 1 N
1
Problem 1 Correct.
tree-answer.jdisk and tree-plank.jdisk created as follows:
----------------------------------------------------------
rm -f tree-plank.jdisk tree-answer.jdisk
/home/plank/cs494/labs/Lab-2-B-Tree-Btree/b_tree_test tree-answer.jdisk CREATE 51200 26 < input.txt > /dev/null
./b_tree_test tree-plank.jdisk CREATE 51200 26 < input.txt > /dev/null
UNIX> cat input.txt
I Mackenzie Malignant
UNIX> b_tree_dcs tree-answer.jdisk
Key_Size: 26
Key 0: S Mackenzie
Val 0: S Malignant
UNIX> b_tree_dcs tree-plank.jdisk
Key_Size: 26
Key 0: S Mackenzie
Val 0: S Malignant
UNIX>
Gradescript number 2 is a little more complex -- it creates an input file with 161 random insertions (some of which will replace keys). It calls the?b_disk_test?program in the lab directory to create?tree-answer.jdisk, from the first 137 entries of the input file. Then it copies?tree-answer.jdisk?to?tree-$USER.jdisk, and processes the remaining 24 insertions using the lab program and your program:
UNIX> sh /home/plank/cs494/labs/Lab-2-B-Tree-Btree/grader.sh 2 N
2
Problem 2 Correct.
tree-answer.jdisk and tree-plank.jdisk created as follows:
----------------------------------------------------------
rm -f tree-plank.jdisk tree-answer.jdisk
head -n 137 input.txt | /home/plank/cs494/labs/Lab-2-B-Tree-Btree/b_tree_test tree-answer.jdisk CREATE 614400 129 > /dev/null
cp tree-answer.jdisk tree-plank.jdisk
sed 1,137d input.txt | /home/plank/cs494/labs/Lab-2-B-Tree-Btree/b_tree_test tree-answer.jdisk > /dev/null
sed 1,137d input.txt | ./b_tree_test tree-plank.jdisk > /dev/null
UNIX> wc input.txt
161 483 2807 input.txt
UNIX> head input.txt
I Blake Shelter
I Mackenzie Effluvium
I Brianna Reluctant
I Evelyn Vigilantism
I Mason Drawn
I Anthony Singlet
I Anna Microjoule
I Kate Howell
I Paige Tinder
I Xavier Roulette
UNIX> tail input.txt
I Peyton Tammany
I Caleb Abe
I Landon Knick
I Makayla Fortescue
I Brody Repeat
I Nathan Swizzle
I Madison Nonetheless
I Ava Astronautic
I Zoey Indebted
I Chloe Smirk
UNIX> b_tree_dcs tree-answer.jdisk | head
Key_Size: 129
Key 0: S Aaron
Key 1: S Abigail
Key 2: S Addison
Key 3: S Aidan
Key 4: S Aiden
Key 5: S Alex
Key 6: S Alexander
Key 7: S Alexis
Key 8: S Allison
UNIX>
The tests performed by the gradescript are based on the number given mod ten. The tests are as follows:
?Number mod 10 = 1: 1 to 15 random insertions using?b_tree_test. Key size is between 15 and 50.
Number mod 10 = 2: 1 to 150 random insertions using the lab directory's?b_tree_test. Key size is between 15 and 214. Then 1 to 150 more random insertions using?b_tree_test.
Number mod 10 = 3: Calling?random_tester_1?with the gradescript number as the seed. 1 to 20 events; Key size between 15 and 214; Val size between 1 and 200.
Number mod 10 = 4: Calling?random_tester_1?with the gradescript number as the seed. 1 to 500 events; Key size between 15 and 252; Val size between 1 and 1023.
Number mod 10 = 5: Calling?random_tester_2?with the gradescript number as the seed. 1 to 500 events; Key size between 15 and 252.
Number mod 10 = 6: Calling?random_tester_2?with the gradescript number as the seed. 1 to 5000 events; Key size between 4 and 10.
Number mod 10 = 7: Calling?random_tester_2?with the gradescript number as the seed. 1 to 5000 events; Key size = 254.
Number mod 10 = 8: Calling?b_tree_test?to create a b-tree of appropriate size. Calling?random_tester_2?in the lab directory to with up to 5000 events, key size = 254. The resulting tree is then copied to?tree-$USER.jdisk, and your?random_tester_2?is called for another 1 to 5000 events.
Number mod 10 = 9: Same as test 8, only now the key size is between 4 and 10.
Number mod 10 = 0: Same as test 9, only now the key size is between 4 and 254.
Each of these can take a few seconds, because the files can be in the 10's of megabytes.
Some help
You don't need this section, but you probably should read it. My internal btree struct was as follows:
typedef struct {
int key_size; /* These are the first 16/12 bytes in sector 0 */
unsigned int root_lba;
unsigned long first_free_block;
void *disk; /* The jdisk */
unsigned long size; /* The jdisk's size */
unsigned long num_lbas; /* size/JDISK_SECTOR_SIZE */
int keys_per_block; /* MAXKEY */
int lbas_per_block; /* MAXKEY+1 */
Tree_Node *free_list; /* Free list of nodes */
Tree_Node *tmp_e; /* When find() fails, this is a pointer to the external node */
int tmp_e_index; /* and the index where the key should have gone */
int flush; /* Should I flush sector[0] to disk after b_tree_insert() */
} B_Tree;
Note how the first three fields are such that you can write them to sector 0 (and read them from sector 0). As it turns out, I write the whole struct to sector 0, but the remaining bytes are ignored.
A?Tree_Node?is the internal representation of a node. Here's its struct:
typedef struct tnode {
unsigned char bytes[JDISK_SECTOR_SIZE+256]; /* This holds the sector for reading and writing.
It has extra room because your internal representation
will hold an extra key. */
unsigned char nkeys; /* Number of keys in the node */
unsigned char flush; /* Should I flush this to disk at the end of b_tree_insert()? */
unsigned char internal; /* Internal or external node */
unsigned int lba; /* LBA when the node is flushed */
unsigned char **keys; /* Pointers to the keys. Size = MAXKEY+1 */
unsigned int *lbas; /* Pointer to the array of LBA's. Size = MAXKEY+2 */
struct tnode *parent; /* Pointer to my parent -- useful for splitting */
int parent_index; /* My index in my parent */
struct tnode *ptr; /* Free list link */
} Tree_Node;
I maintain my own free list of?Tree_Node?structs. When I allocate one, I do three?malloc()?calls. One is for the?Tree_Node?itself, one is for?keys?and one is for?lbas. I set the values of?keys?right after I allocate them, since they point to fixed locations in?bytes. For example,?keys[0]?will point to?bytes+2.?keys[1]?will point to?bytes+2+key_size. Etc. The size of?keys?is?MAXKEY+1. Moreover, the size of?lbas?is?MAXKEY+2. The reason is that in my internal representation of a B-Tree node, I am allowed to store an extra key and val. This simplifies the implementation of splitting in an?enormous?way. If you don't believe me, ask anyone in the 2015 version of CS494 who tried to implement B-Trees without it. When I'm done with a?Tree_Node, I put it onto my free list. That way, if I need a new?Tree_Node?later, and there's one on the free list, I don't have to do any?malloc()?calls -- I simply grab it from the free list.?Keys?and?lbas?are already allocated, and the values of?keys?are already set correctly. That's convenient.
I have a procedure which reads a?Tree_Node?from a jdisk. It reads it into?bytes, and then it copies the LBA's from the end of the sector into?lbas?(using?memcpy). It has to do this, because it uses the end of the sector for that extra key. It also reads?nkeys?and?internal?from the sector. A convenient thing is that the?keys?pointers are already pointing to the correct place, so you don't need to do anything special with the keys.
I set the?flush?field whenever I modify a B-Tree node, and the modified node needs to be flushed to disk. I do the final flushing at the end of?b_tree_insert(). When I flush a node, I need to create the sector. I do this by copying?nkeys?and?internal?into their proper place in?bytes. I then copy?lbas?to their proper place in?bytes. The keys are already in their proper place. I then write?bytes?using?jdisk_write().
What I did for splitting was as follows -- I'd go ahead and insert the key/lba. Since there's room for an extra key and lba in the?Tree_Node, this works even when the node is full. I then checked to see if the node needed to be split, and if so, I called a procedure called?split_node. This procedure has to be recursive, BTW.
When I call?find(), I have it read in all of the nodes on the path to the external node, setting their parent fields, but setting?flush?to 0. The?flush?field is set by?b_tree_insert()?when the node changes. The parent fields are used on splitting. When I'm done with?b_tree_insert()?or?b_tree_find(), I call?free_and_flush()?on every node from the external node up to the root. This flushes the node if?flush?is set, and puts the node onto the free list. I found?free_and_flush?super-helpful.
b_tree_instrument.o: Instrumented code.
In?b_tree_instrument.o, I have added code that prints my?Tree_Node?just after reading, and just before flushing. I have compiled it with?b_tree_test?to make the executable?b_tree_test_inst. This may help you figure out how my pointers work. Here's an example, where I insert a new node into?tree-2.jdisk. First, here's what I'm doing:
UNIX> cp tree-2.jdisk tmp.jdisk
UNIX> echo P | b_tree_test tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
LBA 0x00000008. Internal: 1
Entry 0: Key: Eli . LBA: 0x00000001
Entry 1: . LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: Key: Alexis . LBA: 0x00000004
Entry 1: Key: Allison . LBA: 0x0000000b
Entry 2: Key: Caleb . LBA: 0x0000000a
Entry 3: Key: Daniel . LBA: 0x00000002
Entry 4: . LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: Key: Grace . LBA: 0x00000005
Entry 1: Key: James . LBA: 0x00000009
Entry 2: Key: Landon . LBA: 0x00000003
Entry 3: . LBA: 0x00000000
Reads: 4
Writes: 0
UNIX> echo I A-Aron Stoae | b_tree_test tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
Insert return value: 12
Reads: 3
Writes: 5
UNIX> echo P | b_tree_test tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
LBA 0x00000008. Internal: 1
Entry 0: Key: Allison . LBA: 0x00000001
Entry 1: Key: Eli . LBA: 0x0000000d
Entry 2: . LBA: 0x00000007
LBA 0x00000001. Internal: 0
Entry 0: Key: A-Aron . LBA: 0x0000000c
Entry 1: Key: Alexis . LBA: 0x00000004
Entry 2: . LBA: 0x0000000b
LBA 0x0000000d. Internal: 0
Entry 0: Key: Caleb . LBA: 0x0000000a
Entry 1: Key: Daniel . LBA: 0x00000002
Entry 2: . LBA: 0x00000006
LBA 0x00000007. Internal: 0
Entry 0: Key: Grace . LBA: 0x00000005
Entry 1: Key: James . LBA: 0x00000009
Entry 2: Key: Landon . LBA: 0x00000003
Entry 3: . LBA: 0x00000000
Reads: 5
Writes: 0
UNIX>
Now, here are the tree nodes as they are read in the beginning of the insertion, and as they are flushed after the insertion:
UNIX> cp tree-2.jdisk tmp.jdisk
UNIX> echo I A-Aron Stoae | b_tree_test_inst tmp.jdisk
Attached to tmp.jdisk. FS: 20480 - KS: 200
Find(): Read -- Tree Node 0x1e840c0
Bytes: 0x1e840c0
Nkeys Address: 0x1e845c0
Nkeys Value: 1
Flush Value: 0
Internal Value: 1
LBA Value: 8
Keys: 0x1e84600 keys is allocated with malloc().
Keys[0] 0x1e840c2 (Eli) This is bytes+2
Keys[1] 0x1e8418a (Landon) This is bytes+2+key_size. Since nkeys equals 1, the contents here are
meaningless. They happen to hold "Landon" from a previous time,
when "Landon" was the second key. You have to ignore this, because
Nkeys is one.
Keys[2] 0x1e84252 () This is bytes+2+key_size*2
Keys[3] 0x1e8431a ()
Keys[4] 0x1e843e2 () This is the extra key.
Lbas: 0x1e84630 lbas is allocated with malloc().
Lbas[0] 0x1 Lba of the internal node before "Eli".
Lbas[1] 0x7 Lba of the internal node after "Eli".
Lbas[2] 0x0 The value of these is meaningless. They happen to be zero.
Lbas[3] 0x0
Lbas[4] 0x0
Lbas[5] 0x0 This is the extra lba
Parent 0x0
Parent_index -1
&ptr 0x1e845e8 This is meaningless, because the tree_node isn't on the free list.
Find(): Read -- Tree Node 0x1e84650
Bytes: 0x1e84650
Nkeys Address: 0x1e84b50
Nkeys Value: 4
Flush Value: 0
Internal Value: 0
LBA Value: 1
Keys: 0x1e84b90
Keys[0] 0x1e84652 (Alexis)
Keys[1] 0x1e8471a (Allison)
Keys[2] 0x1e847e2 (Caleb)
Keys[3] 0x1e848aa (Daniel)
Keys[4] 0x1e84972 ()
Lbas: 0x1e84bc0
Lbas[0] 0x4
Lbas[1] 0xb
Lbas[2] 0xa
Lbas[3] 0x2
Lbas[4] 0x6
Lbas[5] 0x0
Parent 0x1e840c0
Parent_index 0
&ptr 0x1e84b78
Free_and_flush(): Write -- Tree Node 0x1e84be0
Bytes: 0x1e84be0
Nkeys Address: 0x1e850e0
Nkeys Value: 2
Flush Value: 1
Internal Value: 0
LBA Value: 13
Keys: 0x1e85120
Keys[0] 0x1e84be2 (Caleb)
Keys[1] 0x1e84caa (Daniel)
Keys[2] 0x1e84d72 ()
Keys[3] 0x1e84e3a ()
Keys[4] 0x1e84f02 ()
Lbas: 0x1e85150
Lbas[0] 0xa
Lbas[1] 0x2
Lbas[2] 0x6
Lbas[3] 0x0
Lbas[4] 0x0
Lbas[5] 0x0
Parent 0x0
Parent_index 0
&ptr 0x1e85108
Free_and_flush(): Write -- Tree Node 0x1e84650
Bytes: 0x1e84650
Nkeys Address: 0x1e84b50
Nkeys Value: 2
Flush Value: 1
Internal Value: 0
LBA Value: 1
Keys: 0x1e84b90
Keys[0] 0x1e84652 (A-Aron)
Keys[1] 0x1e8471a (Alexis)
Keys[2] 0x1e847e2 (Allison) These are leftover from before the split. nkeys is 2,
Keys[3] 0x1e848aa (Caleb) so they are ignored, but they are written to disk.
Keys[4] 0x1e84972 (Daniel) As you can see there is room for 5 keys, which makes splitting easier.
Lbas: 0x1e84bc0
Lbas[0] 0xc
Lbas[1] 0x4
Lbas[2] 0xb
Lbas[3] 0xa Same with these LBA's.
Lbas[4] 0x2
Lbas[5] 0x6
Parent 0x1e840c0
Parent_index 0
&ptr 0x1e84b78
Free_and_flush(): Write -- Tree Node 0x1e840c0
Bytes: 0x1e840c0
Nkeys Address: 0x1e845c0
Nkeys Value: 2
Flush Value: 1
Internal Value: 1
LBA Value: 8
Keys: 0x1e84600
Keys[0] 0x1e840c2 (Allison)
Keys[1] 0x1e8418a (Eli)
Keys[2] 0x1e84252 ()
Keys[3] 0x1e8431a ()
Keys[4] 0x1e843e2 ()
Lbas: 0x1e84630
Lbas[0] 0x1
Lbas[1] 0xd
Lbas[2] 0x7
Lbas[3] 0x0
Lbas[4] 0x0
Lbas[5] 0x0
Parent 0x0
Parent_index -1
&ptr 0x1e845e8
Insert return value: 12
Reads: 3
Writes: 5
UNIX>
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。