Dragon „Hackst“ IV

On this page I incrementally collect everything I have found out so far (as of 2020-09-14) by looking into the data of the Dragon Quest IV PlayStation 1 remake (ドラゴンクエストIV 導かれし者たち). If I cannot continue this project, it may help others to get into it. You can contact me with questions, collaborations or hints.

Related source code is on GitHub. There is also a related romhacking.net forum post.

Getting the data

I took the Japanese game from coolrom. Using cebix’s psximager, the *.bin file can be extracted. You will find three files on the disc:

  • SYSTEM.CNF (68 Bytes) – a small config file that names the executable to start
  • SLPM_869.16 (692.2 KB) – the PS-X EXE executable
  • HBD1PS1D.Q41 (319.4 MB) – the game’s resources

The company Heart Beat Inc. developed both Dragon Quest IV (DQ4) and Dragon Quest VII (DQ7) for the PlayStation. This is why HBD1PS1D.Q41 could mean Heart Beat Disc/Data 1 for PlayStation 1 Dragon Quest 4. Understanding this archive would make it possible to extract resources and could also help to translate the game from Japanese to English. Similarly, DQ7 has the file HBD1PS1D.Q71.

Other attempts in the past

In 2008 the user Kojiro started an initiative to translate Dragon Quest IV. However, the site is down and there seems to have been no progress. It may be helpful to look into their forum using the Wayback Machine.

In 2010 the question of a translation was raised in the Dragon Den’s Forum, but without any results.

In 2014 the user loveemu extracted the music from the HBD1PS1D.Q41 file. See also the user’s Dragon Quest VII: Sound Engine Analysis. The tool psdq7rip, which extracts the sound files from the HBD1PS1D file, just scans for sound data and does not extract the archive completely.

In 2000, Tonura mentioned that the DQ7 file contains monster images at a certain position, but compressed with an LZ algorithm. The Game Lab magazine (ゲームラボの記事) is mentioned here. It seems that the October 2000 issue is the right one.

Understanding HBD1PS1D file

HBD1PS1D.Q41 has a size of 319436800 bytes, which divides evenly into 2048-byte blocks: 319436800 bytes / 2048 bytes = 155975 blocks. In fact, when we visualize each byte of the file as a gray pixel on a 2048 x 155975 bitmap, we see certain patterns:

The pattern shows that some resources consist of more than one 2048-byte block. We call them * 00 00 00 blocks, since they always start with this pattern. In the middle of the file there are white spots: these resources have a different header from the other resources. We call them 0x60010108 blocks, since they always start with this pattern.
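The sector arithmetic and the byte-to-pixel view described above can be reproduced with a few lines of Python. This is a minimal sketch, not the code that produced the picture: the names `sector_count` and `to_pgm` and the choice of the PGM image format are assumptions made for illustration.

```python
SECTOR_SIZE = 2048  # block size stated in the text


def sector_count(file_size: int) -> int:
    """Return the number of 2048-byte sectors; fail if the size is not a multiple."""
    count, remainder = divmod(file_size, SECTOR_SIZE)
    if remainder:
        raise ValueError(f"{file_size} is not a multiple of {SECTOR_SIZE}")
    return count


def to_pgm(data: bytes, width: int = SECTOR_SIZE) -> bytes:
    """Wrap raw bytes in a binary PGM (P5) image, one gray pixel per byte.

    With width == 2048 every image row is one sector. A trailing partial
    row is padded with zero bytes.
    """
    height = -(-len(data) // width)  # ceiling division
    padded = data.ljust(width * height, b"\x00")
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + padded


print(sector_count(319436800))  # 155975
```

Writing `to_pgm(archive_bytes)` to a `.pgm` file gives an image that most viewers can open. For the full 319 MB archive it is probably more practical to render slices of a few thousand sectors at a time.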

The first 2048 bytes

Hex view of the first 2048 bytes

The very first 2048 bytes of the HBD1PS1D file (the first block) differ from the following blocks. It is noticeable that the ASCII string „hdb1ps1d.q41“ is exactly at position 0x400 (1024). Maybe we see two 1024-byte blocks here. 0x00 bytes are rare, which is why I guess that we are not looking at short or int numbers. Decompressing does not yield any meaningful data, and a check with several Japanese text encodings does not reveal any text either.

When we compare DQ4 and DQ7, we see that the first 2048 bytes (up to address 0x800) are nearly identical. They only differ in the string „hdb1ps1d.q{4,7}1“. That means that this very first block does not depend on the differing data of the two games.

Comparison between DQ4 and DQ7 beginning of HBD1PS1D file. In the picture we see it starting from address 0x740.

The * 00 00 00 blocks

These seem to be data blocks. The name comes from the fact that they always start with a small integer, because it states the number of sub-blocks (at most maybe 18). The header of a * 00 00 00 block is 16 bytes long. The integers and shorts are little-endian.

Start | Length (bytes) | Comment
0x00 | 4 | The number of sub-blocks this block has.
0x04 | 4 | The number of 2048-byte sectors the block consists of.
0x08 | 4 | The total data length (raw data without the header information).
0x0c | 4 | Always zero. Maybe the previous value is not an integer but a long value.
The * 00 00 00 block header (16 bytes).
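As a minimal sketch, the header table above maps directly onto four little-endian 32-bit values. The names `BlockHeader` and `parse_block_header` are hypothetical, and treating the fields as unsigned is an assumption.

```python
import struct
from typing import NamedTuple


class BlockHeader(NamedTuple):
    sub_block_count: int  # 0x00
    sector_count: int     # 0x04, in 2048-byte sectors
    data_length: int      # 0x08, raw data without the headers
    reserved: int         # 0x0c, observed to be always zero


def parse_block_header(raw: bytes) -> BlockHeader:
    """Decode the first 16 bytes of a block as four little-endian uint32 values."""
    return BlockHeader(*struct.unpack_from("<4I", raw, 0))
```

For example, the 16 bytes `03 00 00 00 02 00 00 00 b8 0b 00 00 00 00 00 00` would describe a block with 3 sub-blocks that occupies 2 sectors and carries 3000 bytes of data.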

At the beginning of a 2048-byte sector, this header tells us how big the block truly is. The total data length is always smaller than the space the sectors provide; the remainder is filled with 00 bytes until a 2048-byte sector is complete. I guess this is done for read performance. Reading through the whole file, we find 3243 of these blocks. Each block in turn consists of sub-blocks.
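Reading through the whole file block by block could look like the following sketch. It rests on several assumptions that the text does not state outright: blocks follow each other without gaps, the first sector is skipped, the 0x60010108 marker appears in the file as the bytes `60 01 01 08`, and every other block start is a * 00 00 00 header. The name `walk_blocks` is hypothetical.

```python
import struct

SECTOR_SIZE = 2048
MAGIC_60 = bytes.fromhex("60010108")  # assumed on-disk byte order of the marker


def walk_blocks(archive: bytes):
    """Yield (kind, offset, size_in_bytes) for every block after the first sector."""
    offset = SECTOR_SIZE  # the very first sector has its own layout
    while offset + 16 <= len(archive):
        if archive[offset:offset + 4] == MAGIC_60:
            yield ("0x60010108", offset, SECTOR_SIZE)
            offset += SECTOR_SIZE
            continue
        # Otherwise treat the sector as the start of a * 00 00 00 block.
        _subs, sectors, _length, _zero = struct.unpack_from("<4I", archive, offset)
        if sectors == 0:
            raise ValueError(f"unexpected header at offset {offset:#x}")
        yield ("data", offset, sectors * SECTOR_SIZE)
        offset += sectors * SECTOR_SIZE
```

Counting the yielded `"data"` entries is one way to check the figure of 3243 blocks against a real copy of the file. The sketch stops with an error as soon as it meets a sector that fits neither pattern.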

The * 00 00 00 sub-blocks

Each sub-block is described by a 16-byte header. Thus, we first have to read the following header information <number of sub-blocks> times. The i-th header (zero-indexed) is at 0x10 + (i * 16 bytes).

Start | Length (bytes) | Comment
0x00 | 4 | The data length of this sub-block. If the parent block has only one sub-block, this equals the parent block’s data length.
0x04 | 4 | If the data is compressed, this is the uncompressed data length. If it is not compressed, then it is the same as the previous integer.
0x08 | 4 | Unknown
0x0c | 2 | Seems to be some flags. Most of the time 0, but in about 25% of the cases 1280 (0x0500 as a little-endian value, i.e. the bytes 00 05). If it is 1280, then the uncompressed data length is always bigger than the data length. Thus, I assume that this indicates whether the sub-block’s data is compressed.
  count | prop. | decimal value
  18001 | 75.546% | 0
  5821 | 24.429% | 1280
0x0e | 2 | Also some flags, or maybe type information for the sub-block. It can have the following values: [1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]. E.g.: in 21 we always find qQES data.
The sub-block header (16 bytes).
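The sub-block header table can be decoded in the same way as the block header. The sketch below assumes unsigned little-endian fields; `SubBlockHeader`, `parse_sub_block_headers` and `is_compressed` are hypothetical names, and the compression test simply encodes the guess from the table (flags equal to 1280).

```python
import struct
from typing import List, NamedTuple

COMPRESSED_FLAG = 1280  # 0x0500, stored in the file as the bytes 00 05


class SubBlockHeader(NamedTuple):
    data_length: int          # 0x00
    uncompressed_length: int  # 0x04
    unknown: int              # 0x08
    flags: int                # 0x0c
    type_id: int              # 0x0e

    @property
    def is_compressed(self) -> bool:
        # Assumption taken from the text: 1280 marks compressed data.
        return self.flags == COMPRESSED_FLAG


def parse_sub_block_headers(block: bytes, count: int) -> List[SubBlockHeader]:
    """Read `count` 16-byte headers; the i-th one starts at 0x10 + i * 16."""
    return [
        SubBlockHeader(*struct.unpack_from("<3I2H", block, 0x10 + i * 16))
        for i in range(count)
    ]
```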

After the main block header (16 bytes) and the sub-block headers (a multiple of 16 bytes) comes the raw data. The sum of the sub-blocks’ data lengths is equal to the main block’s total data length. For each sub-block we can extract its data array.
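Extracting the data arrays could then be sketched as follows. The text only says that the lengths add up to the total; that the sub-block data is stored back to back in header order is an assumption, as is the function name `split_block`.

```python
import struct


def split_block(block: bytes):
    """Return one (type_id, flags, payload) tuple per sub-block of a block."""
    count, _sectors, total_length, _zero = struct.unpack_from("<4I", block, 0)
    cursor = 16 + 16 * count  # raw data follows the block and sub-block headers
    parts = []
    for i in range(count):
        length, _unpacked, _unknown, flags, type_id = struct.unpack_from(
            "<3I2H", block, 16 + 16 * i)
        parts.append((type_id, flags, block[cursor:cursor + length]))
        cursor += length
    # Consistency check described in the text: lengths sum to the block total.
    if sum(len(payload) for _, _, payload in parts) != total_length:
        raise ValueError("sub-block lengths do not add up to the block's data length")
    return parts
```

The payloads are returned as stored, so a sub-block whose flags mark it as compressed still has to be decompressed afterwards.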

Writing the blocks with their sub-blocks (indented with tabs) as a tree, with an index for each block at the beginning, looks like the following. I also searched for the „pQES“ substring in the sub-block’s data and marked a sub-block as compressed when the value 0x0005 is present at 0x0c. Maybe each block is a scene or level with the sub-block data needed to render it.

Sub-Block Types

Type | Count [with duplicates] | Proportion | Count Distinct | Compressed | Comment
1 | 6 | 0.03% | 2 | no | Font Images
6 | 1730 | 7.26% | 691 | yes | multi-image, Chipset Images, maybe textures
7 | 1458 | 6.12% | 518 | no | Has some patterns but maybe is not an image, (width=20)
8 | 309 | 1.30% | 272 | yes | Monster sprites and battle effects images, (width=128 or 256), TIM files with header 0x10000000 09000000 or 08000000
9 | 473 | 1.99% | 444 | yes | Seems to be images (width=512 maybe), looks like gradients
10 | 256 | 1.07% | 221 | yes | Multi-Image data, monster sprites, multi-TIM with header 0x10000000 09000000
11 | 309 | 1.30% | 201 | yes | Unknown, sometimes contains character sequences at the start (e.g. from a – z), maybe ints that are counted up (hex editor width=8)
12 | 309 | 1.30% | 286 | no | data has these white (0xff) patterns (width=32)
13 | 970 | 4.07% | 375 | yes | character (NPC) sprites images, (width=128), starts with 0x0c000000
14 | 44 | 0.18% | 35 | yes | character (NPC) sprites images, starts with 0x0c000000
15 | 2 | 0.01% | 1 | yes | bigger character (NPC) sprites images (width=128)
17 | 5 | 0.02% | 1 | yes | horse sprites images, starts with 0x0c000000
18 | 5 | 0.02% | 1 | yes | sprites images, starts with 0x0c000000
19 | 141 | 0.59% | 96 | yes | battle effects (fire, slashes and so on) images
20 | 3 | 0.01% | 1 | no | qQES format
21 | 3317 | 13.92% | 278 | no | qQES format
22 | 72 | 0.30% | 2 | no | qQES format
23 | 44 | 0.18% | 12 | no | maybe image data (has these vertical black lines)
24 | 1062 | 4.46% | 485 | no | qQES format
25 | 27 | 0.11% | 3 | yes | Image, background texture, in tiles, width=256
26 | 573 | 2.40% | 454 | no | small size, has also a bit these 0xff patterns, maybe level data? (hex width=4), numbers counting down
31 | 1025 | 4.30% | 885 | no | small size, patterns, maybe level data?, some differ only by small changes
32 | 32 | 0.13% | 8 | no | starts sometimes with printline output text and has japanese text, e.g. シナリオ (path=350/6), start with 0x081f0180
34 | 975 | 4.09% | 957 | no | small size, patterns, maybe level data?, some differ by small changes only
35 | 1576 | 6.61% | 583 | no | small size, patterns (hex editor width=4)
36 | 1506 | 6.32% | 238 | no | small size, patterns (hex editor width=4), rows starts with 0x0*00
37 | 1033 | 4.34% | 298 | no | small size, patterns (hex editor width=2)
38 | 1377 | 5.78% | 392 | no | small size, nearly same patterns
39 | 976 | 4.10% | 927 | yes (but some are not) | has these CCC (0x00434343) patterns often
40 | 1315 | 5.52% | 893 | no | maybe multi data
41 | 1730 | 7.26% | 240 | no | small size, mostly 0x00, small diffs
42 | 213 | 0.89% | 213 | no | unknown
43 | 24 | 0.10% | 23 | yes | sprite images, starts with 0x0c000000
44 | 152 | 0.64% | 29 | no (but some are) | has sometimes error messages, maybe scripts? (path=26022/8 -> エンカウントOFF), (path=637/2 -> job names in japanese), (path=350/11 -> こうげき [attack]), (path=26024/17 -> RIGHT), (path=26027/4 -> メモリーカード), (path=26028/3 -> start menu script?), (path=26022/0 -> battle magic names?)
45 | 140 | 0.59% | 88 | no | some have „{buki,majinyobi} open NG“ ascii header
46 | 612 | 2.57% | 118 | yes (but some not) | contains error messages and japanese text, maybe also messages, start always with 0xe8ffbd27, contains japanese text of inn npc, shop npc etc. some have DQ41章2章 etc., path=596/8 -> MPが  ふえたHP
47 | 27 | 0.11% | 1 | no | contains error messages: c a n ' t . g e t . n e w _ f m a p ! ! ( % d ) ( m a x = % d )

The 0x60010108 blocks

These blocks make up the white spots in the gray image explained above. They are always 2048 bytes in size and start with 0x60010108. In the image they are often „white“ because their data contains trailing 0xFF bytes.

The header seems to be 32 bytes in length.

Start | Length (bytes) | Comment
0x00 | 4 | Always the „magic number“ 0x60010108.
0x04 | 2 | An index that always ranges from 0 to 4, +1 per block.
0x06 | 2 | Seems to be a count which is always 5. The previous number loops from 0 to 4 and thus has five values.
0x08 | 4 | A kind of part number (1-indexed) that counts up each time the index at 0x04 reaches 0 again.
0x0c | 4 | An integer
0x10 | 2 | Always 0x8000 (bytes 80 00), which is 128 as a little-endian value
0x12 | 2 | Always 0x7800 (bytes 78 00), which is 120 as a little-endian value
0x14 | 4 | Unknown number, maybe flags? The fourth byte is always 0x38
0x18 | 2 | Can be 0x0100, 0x0200 or 0x0300
0x1a | 4 | Always 0x03000000
0x1e | 2 | Always 0x00
The 0x60010108 block header (32 bytes)
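A sketch of a decoder for this 32-byte header follows. Since the meaning of most fields is unknown, the field names (`value_0c`, `value_10` and so on) are placeholders derived from the offsets, and `Header60` and `parse_header60` are hypothetical names. All numeric fields are read as unsigned little-endian values, in line with the 0x8000 = 128 observation in the table.

```python
import struct
from typing import NamedTuple


class Header60(NamedTuple):
    magic: bytes   # 0x00, expected to be the bytes 60 01 01 08
    index: int     # 0x04, observed range 0..4
    count: int     # 0x06, observed to be always 5
    part: int      # 0x08, 1-indexed part number
    value_0c: int  # 0x0c, an integer of unknown meaning
    value_10: int  # 0x10, observed to be always 128
    value_12: int  # 0x12, observed to be always 120
    value_14: int  # 0x14, unknown, maybe flags
    value_18: int  # 0x18
    value_1a: int  # 0x1a
    value_1e: int  # 0x1e


def parse_header60(sector: bytes) -> Header60:
    """Decode the 32-byte header; '<' disables padding, so 0x1a may be unaligned."""
    return Header60(*struct.unpack_from("<4s2H2I2HIHIH", sector, 0))
```

Grouping the decoded headers by `part` and ordering them by `index` is one way to put the five sectors of each part back together.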

There are 26635 of these blocks. When we put the parts (at 0x08) together, for example, from part 1 to 195, we get 40 entries.

I checked: it is not Japanese text in any of several Japanese encodings.

Maybe we see here some soundfonts, sound effects or wave files?

Find Japanese Texts

The Japanese Industrial Standards (JIS) define how Japanese text can be encoded in data. It seems that Dragon Quest uses the Shift-JIS standard, in which every kana or kanji character takes 2 bytes. You can spot hiragana in the data by looking for byte pairs whose first byte is 0x82. If this occurs many times in a row, there is a hiragana sequence.
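The hiragana heuristic can be sketched as a small scanner. In Shift-JIS the hiragana ぁ to ん occupy the codes 0x829F to 0x82F1, so the sketch also checks the second byte, which is a refinement of the rule above. The name `find_hiragana_runs` and the default minimum run length of four characters are assumptions.

```python
def find_hiragana_runs(data: bytes, min_chars: int = 4):
    """Return (offset, char_count) for runs of Shift-JIS hiragana byte pairs."""
    runs = []
    i = 0
    while i + 1 < len(data):
        start = i
        # Consume consecutive pairs 0x82 0x9F..0xF1 (hiragana in Shift-JIS).
        while i + 1 < len(data) and data[i] == 0x82 and 0x9F <= data[i + 1] <= 0xF1:
            i += 2
        chars = (i - start) // 2
        if chars >= min_chars:
            runs.append((start, chars))
        if i == start:
            i += 1  # no pair here, move on by one byte
    return runs
```

Applied to an uncompressed sub-block payload, each reported offset can be decoded with `data[offset:offset + 2 * count].decode("shift_jis")` to see whether the run is readable text or a coincidence.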

In the first chapter there is already some dialog. This can help to find the position of the text in the data, if it is not compressed. I just set the hero’s name to „ああああ“.

The dialog is:
どうした? <Heroname>。
もう降参かい?
そうだな。今日は このくらいに
しておこう……。
私の役目は はやく お前を
一人前に 育てることだが
あせっても しかたあるまい。
ちて もどるとするか。
<Heroname>も 家で ゆっくり
休むといいだろう。
勇者さま 勇者さま……。
勇者さま どうか たすけて……。

Unfortunately, words of the text above cannot be found in the non-image blocks using typical Japanese encodings. I assume that the dialog texts are encoded in a non-standard way. It is possible that the encoding follows the layout of the font image that is used in the game. The font images can be found in type 1.
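A search like the one described above could be sketched as follows: encode a word from the dialog with several standard codecs and look for the resulting bytes. The list of codecs and the name `find_text` are assumptions; the codecs actually tried for this page are not listed.

```python
CANDIDATE_CODECS = ["shift_jis", "euc_jp", "utf_8", "utf_16_le", "utf_16_be"]


def find_text(data: bytes, text: str, codecs=CANDIDATE_CODECS):
    """Return {codec: [offsets]} for every codec under which `text` occurs in `data`."""
    hits = {}
    for codec in codecs:
        try:
            needle = text.encode(codec)
        except UnicodeEncodeError:
            continue  # this codec cannot represent the text
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        if offsets:
            hits[codec] = offsets
    return hits
```

An empty result for short, distinctive words such as 降参 or 勇者 across all decompressed sub-blocks would support the assumption that the dialog uses a custom encoding.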

How letters are rendered in DQ4 using a font image. On the right the font image that is stored in VRAM.

The grayscale representation of the font image. This is just the byte representation. It has to be parsed correctly.

Understanding the font images

These two images are used in the VRAM. I guess that the font image we found in the data is compressed, but not with LZS (maybe run-length encoding?). If we decompressed it, we would get the two images.

The alphabet does not start at the top-left corner. It starts at the 14th and 15th letter. Starting from these positions, there are always 32 * 8 = 256 letters. Thus, one byte would be enough to address a letter within each image.

It is interesting to note that it starts with ‚z‘ (left image) and continues with ‚y‘ (right image), so you have to alternate between the images to get the alphabet in reversed order.

Shift-JIS table.

Comparing it with Shift-JIS works quite well: always alternating, it runs from z to a, then from Z to A, then from 9 to 0, then from ん to ぁ, then the katakana. Thus, they rendered their font image based on this ordering, but reversed and split across two images.
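The ordering described above can be modelled in a few lines. This is only a sketch of the observation, not a decoder for the font data: it uses ASCII letters and digits as stand-ins for the glyphs, covers only the groups whose order is stated explicitly (the katakana are left out), and the names `reversed_font_order` and `split_alternating` are hypothetical.

```python
import string


def reversed_font_order():
    """Glyph order as described: z-a, Z-A, 9-0, then hiragana from ん down to ぁ."""
    hiragana = [chr(cp) for cp in range(0x3041, 0x3094)]  # ぁ .. ん in Unicode
    groups = [string.ascii_lowercase, string.ascii_uppercase, string.digits, hiragana]
    order = []
    for group in groups:
        order.extend(reversed(group))
    return order


def split_alternating(order):
    """Distribute the glyphs alternately: even positions left, odd positions right."""
    return order[0::2], order[1::2]


left, right = split_alternating(reversed_font_order())
print(left[0], right[0])  # z y
```

If the dialog bytes really index into these images, a table like this could serve as a starting point for a custom text decoder once the exact start positions are known.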

LZS Compression Scheme

Some sub-block data is compressed with the LZS algorithm. Judging by this helpful documentation, it seems I guessed the LZS scheme correctly.

The buffer has a size of 4096 bytes. The control byte has to be read bit-wise from right to left (least significant bit first). A 1-bit indicates a literal byte. A 0-bit indicates a reference. A reference is 2 bytes long, e.g. 1110101111110000. The last four bits describe the length, to which 3 has to be added to get the actual length. In our example the length bits are 0000, which is 0 in decimal, so the actual length is 3. The offset is assembled from the remaining three 4-bit parts p1=1110, p2=1011 and p3=1111, which are combined as p3 p1 p2, giving 1111 1110 1011 (in effect a little-endian 12-bit value). The decimal value of this example offset is 4075. A value of 18 has to be added to the offset, so we get the actual offset of 4093. Since our buffer has a size of 4096 and our example length is 3 bytes, in this example the last three bytes of the buffer are written to the output stream. The offset may overflow the buffer size, which is why you should take it % 4096 (modulo the buffer size). Continue until the compressed data has been read completely.
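The scheme described above can be sketched in Python as follows. This is not the DQLZS class, only an illustration under stated assumptions: the buffer starts zero-filled, writing starts at buffer position 0, and the optional `expected_size` parameter (for example the uncompressed length from the sub-block header) is an addition for convenience. The name `lzs_decompress` is hypothetical.

```python
WINDOW_SIZE = 4096  # buffer size from the text
OFFSET_BIAS = 18    # value added to the stored offset
MIN_MATCH = 3       # value added to the stored length


def lzs_decompress(data: bytes, expected_size=None) -> bytes:
    window = bytearray(WINDOW_SIZE)  # assumption: buffer starts zero-filled
    write_pos = 0                    # assumption: writing starts at position 0
    out = bytearray()
    pos = 0

    def emit(byte: int) -> None:
        nonlocal write_pos
        out.append(byte)
        window[write_pos] = byte
        write_pos = (write_pos + 1) % WINDOW_SIZE

    while pos < len(data):
        control = data[pos]
        pos += 1
        for bit in range(8):  # least significant bit first
            if expected_size is not None and len(out) >= expected_size:
                return bytes(out[:expected_size])
            if (control >> bit) & 1:  # 1-bit: literal byte
                if pos >= len(data):
                    return bytes(out)
                emit(data[pos])
                pos += 1
            else:  # 0-bit: two-byte reference
                if pos + 1 >= len(data):
                    return bytes(out)
                b0, b1 = data[pos], data[pos + 1]
                pos += 2
                offset = ((b1 >> 4) << 8) | b0  # p3 p1 p2
                length = (b1 & 0x0F) + MIN_MATCH
                read_pos = (offset + OFFSET_BIAS) % WINDOW_SIZE
                for k in range(length):
                    # Byte-wise copy so a reference may overlap its own output.
                    emit(window[(read_pos + k) % WINDOW_SIZE])
    return bytes(out) if expected_size is None else bytes(out[:expected_size])
```

With the example reference from the text (bytes `EB F0`) the sketch computes the offset 4075 + 18 = 4093 and copies three bytes from the end of the buffer. Copying byte by byte while also writing to the buffer is a design choice that lets short references expand into long runs.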

The algorithm is implemented in the DQLZS class.

Doing this for sub-blocks of type 8, we can decompress the TIM files without errors.

A decompressed TIM file of DQ4.
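Whether a decompressed sub-block really is a TIM file can be checked against the header bytes mentioned in the type table (10 00 00 00 followed by 08 00 00 00 or 09 00 00 00). The sketch below relies on the general TIM layout, in which the low three bits of the second word select the pixel mode and bit 3 signals a colour lookup table; the name `describe_tim` is hypothetical.

```python
import struct

TIM_MAGIC = 0x10
BITS_PER_PIXEL = {0: 4, 1: 8, 2: 16, 3: 24}


def describe_tim(data: bytes):
    """Return (bits_per_pixel, has_clut), or None if data does not start like a TIM."""
    if len(data) < 8:
        return None
    magic, flags = struct.unpack_from("<2I", data, 0)
    if magic != TIM_MAGIC:
        return None
    mode = flags & 0x07
    if mode not in BITS_PER_PIXEL:
        return None
    return BITS_PER_PIXEL[mode], bool(flags & 0x08)
```

Under this reading, the flag value 09 000000 means an 8-bit image with a palette and 08 000000 a 4-bit image with a palette.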
