;===============================================;
;Dual-Tile Encoding:                            ;
;NES/Famicom Implementation                     ;
;version 1.01                                   ;
;===============================================;
;Written by RedComet                            ;
;redcomet@rpgclassics.com                       ;
;http://www.rpgclassics.com/subsites/twit/      ;
;===============================================;

;===============================================;
;Table of Contents                              ;
;===============================================;
;-Version History                               ;
;-Introduction                                  ;
;-What is DTE?                                  ;
;-Finding the Text Routine                      ;
;-Implementing DTE in an NES game               ;
;===============================================;

;=======================================;
;Version History                        ;
;=======================================;
;Version 1.01:                          ;
;2-25-06                                ;
;Second release. I rewrote some of the  ;
;Implementation and Finding the Text    ;
;Routine sections to be a bit more      ;
;user friendly. Thanks go out to I.W.   ;
;for all the suggestions and advice.    ;
;=======================================;
;Version 1.00:                          ;
;12-11-05                               ;
;Initial release. If there's demand,    ;
;this will be expanded upon further.    ;
;=======================================;

;=======================================;
;Introduction                           ;
;=======================================;
Dual-Tile Encoding (commonly referred to as DTE in romhacking circles) has to
be one of the easiest things you can do to a game to gain space, yet, like most
assembly stuff, it goes all but undocumented! I'm hoping to rectify that with
this document.

The purpose of this document is to provide you with information and examples of
what DTE is and how to successfully implement it in a NES/Famicom rom. The
below method is how I do it, and it may work on other systems with some
tweaking.

In future updates, I hope to address any questions, problems, or suggestions
readers might have.
The point of this is to help you, so email me if it's not working at
redcomet@rpgclassics.com

Please be a dear and visit my site for this and (hopefully) more documents and
information on various on-going video game translations at:
http://www.rpgclassics.com/subsites/twit/

Before going any further, I want to let you know that basic romhacking (tables,
strings, pointers and the like) and rudimentary assembly programming knowledge
is assumed; if you're new to romhacking, this document isn't for you. As for
the assembly part, just make sure you understand the concepts of low-level
programming and, coupled with a few reference documents for the NES, you should
be fine.

Now, onto the story...

;=======================================;
;What is DTE?                           ;
;=======================================;
DTE stands for Dual-Tile Encoding. This means that one hex value can represent
two characters at the same time. Consider the following example:

Say you have a table as follows:

0A=a
0B=b
0C=c
0D=A
0E=.
0F=(space)

And you had the following string:

A cab.

Using the given table, this would be stored in ROM as:

0D 0F 0C 0A 0B 0E - 6 bytes.

Now, if we were to add the following DTE values to the game so that our table
was as follows:

0A=a
0B=b
0C=c
0D=A
0E=.
0F=(space)
10=A(space)
11=ca
12=b.

We could then store the string as:

10 11 12 - 3 bytes; half the size of the non-DTE string.

;=======================================;
;Finding the Text Routine               ;
;=======================================;
I'm including this here, as it's both relevant and sorely undocumented. Please
note, this is my method of finding text routines in NES/Famicom games using
FCEUXD and it assumes the game in question uses a typical pointer system (for
more information on pointers, consult MadHacker's document on the subject).

There are a few ways you can go about finding the text routine with FCEUXD; the
following is the method I've had the most success with.
Please note that it is not only possible, but highly probable that any two
games will have distinctly different methods of reading text - some may even
have compressed text. In which case, you're on your own. That's way beyond the
scope of this document.

Anyway, fire up the rom and bring up a string that you can get to fairly easily
(I like the very first string displayed after starting a new game). Find this
string in rom using your preferred method. Then, calculate and find the pointer
for this string; test it to make sure.

Now open up the Debugger and set a Read Breakpoint for the address pointed to.
Reset the game and play up to the point when the text is read and the Debugger
should snap. When it does, you'll be "standing" right before the data pointed
to by the pointer is read.

Note: The pointer will almost always be stored in the Zero Page ($00-$FF) in
order for the indirect indexed addressing to work. There are exceptions to
this, but you're only going to encounter them when dealing with a processor
that features a movable Zero Page, like the 65816.

Example:
Say the pointer $84A9 is stored at $20-21 in ram. The 6502 stores addresses low
byte first, so at $20 in ram is $A9 and at $21 is $84. The pointer_address is
the address in ram where the pointer is stored ($20-21 here). Let's assume that
the following statement is used to read a byte of text:

    lda ($20),y

Indirect indexed addressing always uses the Y register, so you won't have to
worry about the X register being used.

Note: If you don't understand why the above is lda ($20) and not lda ($21),
look up Indirect Indexed Addressing for clarification.

Using the above method, we would open the debugger and set a Breakpoint for
when $84A9 is Read. Then we would reset the game (if necessary) and play until
the text is displayed. Once $84A9 is read, the debugger would snap right before
lda ($20),y is executed - inside the routine.
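If the address math behind lda ($20),y is still fuzzy, here is a small Python
sketch of how the effective address is worked out. It is only a model under my
own assumptions: the function name, the ram list, and the Y values are made up
for illustration and do not come from any particular game. The one hard fact it
relies on is that the 6502 keeps the low byte of a pointer at the lower address.

```python
def indirect_indexed_address(ram, zp_addr, y):
    """Return the address that `lda (zp_addr),y` would read from."""
    low = ram[zp_addr]                 # low byte of the pointer comes first
    high = ram[(zp_addr + 1) & 0xFF]   # high byte follows (wraps inside the Zero Page)
    return (((high << 8) | low) + y) & 0xFFFF


# Hypothetical ram contents matching the $84A9 example.
ram = [0] * 0x0800          # 2 KB of work ram, all zeroed
ram[0x20] = 0xA9            # low byte of $84A9
ram[0x21] = 0x84            # high byte of $84A9

first_byte_addr = indirect_indexed_address(ram, 0x20, 0)   # Y = 0
third_byte_addr = indirect_indexed_address(ram, 0x20, 2)   # Y = 2

print(f"${first_byte_addr:04X} ${third_byte_addr:04X}")    # prints "$84A9 $84AB"
```

With Y at zero the read lands exactly on the address you set the breakpoint
for, which is why breaking on the pointed-to address catches the first byte of
the string.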
From here, you can set breakpoints, use the Trace Logger to dump the routine as
it is executed, Step through each instruction, or you can disassemble the rom
and study the routine that way. I usually disassemble the bank the code is in
and use a combination of the above methods to study the code and see how it
works.

That should just about cover one method of finding the text routine. If you
know of another way, let me know and I'll be more than happy to include it
here.

;=======================================;
;Implementing DTE in an NES game:       ;
;=======================================;
Now that you've located the text routine, you need to spend some time studying
it; you want to understand how it works at an almost intimate level. From there
you can determine what hex values are neither control codes nor actual
characters. I prefer a range of unused values (like $80 through $A0 instead of
$80-$89 and $90-$A0, with $8A-$8F being used for other things already). Of
course, if you have, say, only one or two unused values here and there (or even
characters that aren't used at all), you can use them, but I feel a range of
values makes coding and most things in general easier to manage. Anyway, once
you've found that, you can begin to work on the assembly side of things.

Now, you're going to want to study the game's original, unaltered text read
routine until you know how everything (or almost everything) works; this can
save you a lot of time debugging later. What you're going to do is overwrite
the initial text read with a jump (JMP) to the new DTE code. Alternatively, if
you're working with a Japanese game and translating it to English, you can
usually just replace the Dakuten/Handakuten code & table with the DTE code &
table fairly easily. One other method of adding the DTE would be to rewrite the
original routine and place it inline with the rest of the code.

Note: Below I assume that the game checks for control codes right after a byte
of text is read.
If the game you're working on doesn't do this, it's ok. It's included to give
you a better idea of how the new DTE routine fits into the grand scheme of
things.

Here's an example (first old and then the new):

;[old]
text_read:
    ldy index               ;Retrieve the text index from ram.
    lda (pointer),y         ;Load the current byte of text to work with.

control_code:
    cmp #cc_value           ;Here we check to see if the byte read is a
    bne not_cc              ;control code. If it's not, we move onto the next
                            ;bit of code that loads the tile to be displayed or
                            ;whatever. (cc_value is a placeholder for the
                            ;control code's hex value.)

;[new]
text_read:
    ldy index               ;This starts off the same as the old code.
    lda (pointer),y

dte_check:
    cmp #first_dte_value    ;Here we see if the byte is within the range of
    bcc dte_end             ;the DTE values. If it is, we move on to the DTE
    clc                     ;specific code that loads the characters.
    cmp #last_dte_value     ;If not, we branch back to where the original code
    beq dte_code            ;went, and continue on as normal.
    bcs dte_end             ;Note: The use of a beq statement following the
                            ;cmp #last_dte_value; this is necessary to allow
                            ;the last value to be processed.
                            ;Also note that all variables are assumed to have
                            ;been assigned values. The # marks
                            ;first_dte_value and last_dte_value as constants
                            ;(immediate values), not ram addresses.

See, that wasn't too difficult at all. As mentioned above, in Japanese games,
there'll usually be a chunk of code cmp-ing the byte read to determine whether
or not it is a dakuten/handakuten character, like this:

text_read:
    ldy index               ;Same as above examples.
    lda (pointer),y

jap_check:
    cmp #n                  ;Where n equals the hex value of the first dakuten/
    bcs control_code        ;handakuten character. If the byte isn't one of these
                            ;we'll continue on and check to see if it's a control
                            ;code.

Usually, you can just overwrite this block of code (jap_check in the above)
with your DTE check. Unless, of course, you're not working on a Japanese game
(or the DTE routine is bigger than the current code), in which case, you'll
need to find a place where it would be easiest to perform the DTE check.
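If the chain of branches in dte_check is hard to follow, the same decision can
be sketched in Python. This is only a model of the assembly above, not part of
any game. The values $10 and $12 are assumptions borrowed from the "A cab."
example table, and the function name is made up for illustration.

```python
FIRST_DTE_VALUE = 0x10   # assumed: first DTE value in the "A cab." table
LAST_DTE_VALUE = 0x12    # assumed: last DTE value in the "A cab." table


def is_dte(byte):
    """Mirror dte_check branch by branch; True means 'fall into dte_code'."""
    if byte < FIRST_DTE_VALUE:    # cmp first / bcc dte_end: below the range
        return False
    if byte == LAST_DTE_VALUE:    # cmp last / beq dte_code: exactly the last value
        return True
    if byte > LAST_DTE_VALUE:     # bcs dte_end: carry is also set on "equal",
        return False              # which is why the beq has to come first
    return True                   # anything left is inside the range


results = [is_dte(b) for b in (0x0F, 0x10, 0x11, 0x12, 0x13)]
print(results)   # prints [False, True, True, True, False]
```

Drop the beq (the second test here) and $12 would be treated as an ordinary
byte, because bcs fires on "greater than or equal".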
Once you've got the dte_check in and working, it's time to code the meat and
potatoes of the routine - the actual dte decoding!

To determine whether the first character is to be displayed or the second, a
test byte is needed. So, you're going to have to find a byte of ram that isn't
being used by any other part of the game. It doesn't have to be Zero Page
($00-$FF), either. (The code below also uses a second spare byte, unused_byte,
to hold the X register while the routine runs.) After you've found one, we'll
look at the code:

dte_code:
    sec                     ;We subtract the start value, because the first value
    sbc #first_dte_value    ;is going to be the very first entry in the look-up
    clc                     ;table.
    stx unused_byte         ;We want to preserve the data in the X register somewhere.
                            ;You never know when something elsewhere is going to need
                            ;that data.
    ldx test_byte           ;Here we load the test_byte into X.
    bne second_run          ;You have to make a choice here: Do you want
                            ;test_byte = 1 to signify that you want the first
                            ;byte or the second? I let 0 signify that I want
                            ;the first, and 1 to signify that I want the second.
                            ;That way I get to make use of the bne statement to
                            ;cut down on processing time and rom space.
                            ;Note: Using the above convention, we only have to
                            ;test the test_byte to see if it's nonzero.
                            ;If it's nonzero, we know that we don't want the
                            ;first byte. If the test_byte is zero, then the
                            ;branch fails and control falls to the first_run
                            ;routine below, which will retrieve the first byte
                            ;of the DTE pair.

first_run:
    ldx #$01                ;We load #$01 into X, which will be stored as the
                            ;test_byte later. This way, we'll get the second
                            ;byte of the encoded data on the next pass.
    dec index               ;We want to decrement the text index so that we
                            ;read the same byte again (the DTE byte) the next
                            ;time the text read routine is executed. Otherwise,
                            ;we won't be able to read both bytes of the DTE.
    asl a                   ;We double the pair number to get the offset of the
                            ;pair's first byte in the table.
    jmp get_dte             ;We need to get the byte now.

second_run:
    ldx #$00                ;This way, we reset the test_byte so we don't
                            ;cause any problems during future DTE reads.
    asl a                   ;Here, we double the pair number, and, since we want
    clc                     ;the second byte of the pair, we add one to the
    adc #$01                ;accumulator (the DTE index).

Okay, so that's how the test byte works. At this point, the only thing left to
do is retrieve the value we want from the look-up table. You'll need to have
already generated a DTE table and added it to rom for this, though. We'll
tackle that next before moving on.

You have a few options when it comes to generating the DTE table: you can go
through each script and count by hand the number of times every possible pair
of characters appears; you could code your own program to do this for you; or
you could use one of the existing programs to do it for you.

I would recommend coding your own program to do this, as this will allow you to
add all the features you want (like using multiple scripts to determine the
most common pairs of characters). Seeing as how not everyone's going to want or
be able to do this, your best bet is using an existing program (or you could do
it manually, in which case, you must really hate yourself). I only know of two
programs that read a single file and generate the most common pairs of letters:
DTE Crunch by Klarth and DTE Table Generator by zero soul (and I'm not even
sure if DTE Crunch is publicly available yet).

DTE Table Generator is open source, and I've had fairly decent results with it
(needed to tweak and add a few features here and there, but it got the job
done). I don't really remember using DTE Crunch, so I can't vouch for how good
a program it is. If Klarth's other programs are any indication, DTE Crunch
should be well worth your time.

For further information on using these programs to generate DTE tables, consult
the documentation that comes with each program.

Once the table has been generated, you need to insert it into rom. Using the
"A cab."
example at the beginning of this document, it would look something like this:

dte_table:                  ;You can either insert the table manually in a
    .db $0D, $0F            ;hex editor and write down the address to include
    .db $0C, $0A            ;in your code, or you can tack the table in ASM
    .db $0B, $0E            ;format as shown (.db whatever) and let your
                            ;assembler do the work for you. If you go with the
                            ;latter, consult your assembler's documentation to
                            ;see if and how to define data.
                            ;Each row is one pair: $10 = "A ", $11 = "ca",
                            ;$12 = "b."

Now we have a DTE table, the only thing left is to write some code that can
retrieve data from it. Let's do it!

get_dte:
    stx test_byte           ;Store the new test_byte value so the next run knows
                            ;whether to fetch the second byte of this pair or
                            ;start on a fresh byte.
    tax                     ;We're going to use the accumulator as the index for
                            ;the DTE table. (That's why we performed that math
                            ;on the DTE values.)
    lda dte_table,x         ;Finally, we retrieve the value we want from the
                            ;table.
    ldx unused_byte         ;Restore the contents of the X register to what it was
                            ;prior to the dte routine.

dte_end:
    jmp control_code        ;From there, we jump back to the original code, and
                            ;treat the decoded byte as we would any other.

There you have it. Here's what it looks like altogether:

text_read:
    ldy index               ;This starts off the same as the old code.
    lda (pointer),y

dte_check:
    cmp #first_dte_value    ;Here we see if the byte is within the range of
    bcc dte_end             ;the DTE values. If it is, we move on to the DTE
    clc                     ;specific code that loads the characters.
    cmp #last_dte_value     ;If not, we branch back to where the original code
    beq dte_code            ;went, and continue on as normal.
    bcs dte_end             ;Note: The use of a beq statement following the
                            ;cmp #last_dte_value; this is necessary to allow
                            ;the last value to be processed.
                            ;Also note that all variables are assumed to have
                            ;been assigned values. The # marks
                            ;first_dte_value and last_dte_value as constants
                            ;(immediate values), not ram addresses.

dte_code:
    sec                     ;We subtract the start value, because the first value
    sbc #first_dte_value    ;is going to be the very first entry in the look-up
    clc                     ;table.
    stx unused_byte         ;We want to preserve the data in the X register somewhere.
                            ;You never know when something elsewhere is going to need
                            ;that data.
    ldx test_byte           ;Here we load the test_byte into X.
    bne second_run          ;You have to make a choice here: Do you want
                            ;test_byte = 1 to signify that you want the first
                            ;byte or the second? I let 0 signify that I want
                            ;the first, and 1 to signify that I want the second.
                            ;That way I get to make use of the bne statement to
                            ;cut down on processing time and rom space.
                            ;Note: Using the above convention, we only have to
                            ;test the test_byte to see if it's nonzero.
                            ;If it's nonzero, we know that we don't want the
                            ;first byte. If the test_byte is zero, then the
                            ;branch fails and control falls to the first_run
                            ;routine below, which will retrieve the first byte
                            ;of the DTE pair.

first_run:
    ldx #$01                ;We load #$01 into X, which will be stored as the
                            ;test_byte later. This way, we'll get the second
                            ;byte of the encoded data on the next pass.
    dec index               ;We want to decrement the text index so that we
                            ;read the same byte again (the DTE byte) the next
                            ;time the text read routine is executed. Otherwise,
                            ;we won't be able to read both bytes of the DTE.
    asl a                   ;We double the pair number to get the offset of the
                            ;pair's first byte in the table.
    jmp get_dte             ;We need to get the byte now.

second_run:
    ldx #$00                ;This way, we reset the test_byte so we don't
                            ;cause any problems during future DTE reads.
    asl a                   ;Here, we double the pair number, and, since we want
    clc                     ;the second byte of the pair, we add one to the
    adc #$01                ;accumulator (the DTE index).

get_dte:
    stx test_byte           ;Store the new test_byte value so the next run knows
                            ;whether to fetch the second byte of this pair or
                            ;start on a fresh byte.
    tax                     ;We're going to use the accumulator as the index for
                            ;the DTE table. (That's why we performed that math
                            ;on the DTE values.)
    lda dte_table,x         ;Finally, we retrieve the value we want from the
                            ;table.
    ldx unused_byte         ;Restore the contents of the X register to what it was
                            ;prior to the dte routine.

dte_end:
    jmp control_code        ;From there, we jump back to the original code, and
                            ;treat the decoded byte as we would any other.

dte_table:
    .db $0D, $0F            ;$10 = "A "
    .db $0C, $0A            ;$11 = "ca"
    .db $0B, $0E            ;$12 = "b."

And there you have it folks, Dual-Tile Encoding. Simple, no? In my experience,
this is one of the easiest assembly modifications you can do, and it is the one
I personally cut my teeth on.

If you have any questions, contact me at redcomet@rpgclassics.com

Note: Much of the above is based on source code Gideon Zhi of Aeon Genesis
(http://agtp.romhack.net/) posted on the old Romhacking.com boards, so thanks
go out to Gideon for, in part, making this possible. Thanks, Gid!
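
One last extra: if you want to sanity-check the logic of the routine before
touching a rom, the whole thing can be modelled in a few lines of Python. Treat
this as a rough sketch under my own assumptions, not as the routine itself: it
uses the table from the "What is DTE?" example ($10 = "A ", $11 = "ca",
$12 = "b."), it assumes the game's original code advances the text index after
every read, and all the Python names are made up for illustration.

```python
# Character table and DTE pairs from the "A cab." example.
TABLE = {0x0A: "a", 0x0B: "b", 0x0C: "c", 0x0D: "A", 0x0E: ".", 0x0F: " "}
FIRST_DTE_VALUE = 0x10
LAST_DTE_VALUE = 0x12
DTE_TABLE = [0x0D, 0x0F,   # $10 = "A "
             0x0C, 0x0A,   # $11 = "ca"
             0x0B, 0x0E]   # $12 = "b."


def decode(data):
    """Model of text_read + the DTE routine; returns the decoded bytes."""
    out = []
    index = 0        # the text index kept in ram
    test_byte = 0    # 0 = want first byte of a pair, 1 = want second
    while index < len(data):
        a = data[index]               # ldy index / lda (pointer),y
        index += 1                    # assumed: the original code advances the index
        if FIRST_DTE_VALUE <= a <= LAST_DTE_VALUE:    # dte_check
            offset = (a - FIRST_DTE_VALUE) * 2        # sec / sbc, then asl a
            if test_byte == 0:        # first_run
                test_byte = 1
                index -= 1            # dec index: read this DTE byte again next time
            else:                     # second_run
                test_byte = 0
                offset += 1           # clc / adc #$01
            a = DTE_TABLE[offset]     # get_dte: lda dte_table,x
        out.append(a)                 # hand the byte back to the original code
    return out


decoded = decode([0x10, 0x11, 0x12])
text = "".join(TABLE[b] for b in decoded)
print(text)   # prints "A cab."
```

Three bytes go in and the six bytes of "A cab." come out, with each DTE value
being read twice thanks to the index decrement on the first pass.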