jak-project/common/util/FontUtils.h
Tyler Wilding c87db7e670
i18n: subtitle code cleanup and update new subtitle JSON files to be compatible with Crowdin (#2802)
The main thing that was done here was to slightly modify the new
subtitle-v2 JSON schema to be more similar to the existing one so that
it can properly be used in Crowdin.

Draft while I double-check the diff myself

Along the way the following was also done (among other things):
- got rid of as much duplication as was feasible in the serialization
and editor code
- separated the text serialization code from the subtitle code for
better organization
- simplified "base language" in the editor. The new subtitle format has
built-in support for defining a base language so the editor doesn't have
to be used as a crutch. Also, cutscenes only defined in the base come
first in the list now as that is generally the order you'd work from
(what you havn't done first)
- got rid of the GOAL subtitle format code completely
- switched jak 2 text translations to the JSON format as well
- found a few mistakes in the jak 1 subtitle metadata files
- added a couple minor features to the editor
- consolidate and removed complexity, ie. recently all jak 1 hints were
forced to the `named` type, so I got rid of the two types as there isn't
a need anymore.
- removed subtitle editor groups for jak 1, the only reason they existed
was so when the GOAL file was manually written out they were somewhat
organized, the editor has a decent filter control, there's no need for
them.
- removed the GOAL -> JSON python script helper, it's been a month or so
and no one has come forward with existing translations that they need
help with migrating. If they do need it, the script will be in the git
history.

I did some reasonably through testing in Jak1/Jak 2 and everything
seemed to work. But more testing is always a good idea.

---------

Co-authored-by: ManDude <7569514+ManDude@users.noreply.github.com>
2023-07-09 02:53:39 +01:00

97 lines
3.2 KiB
C++

#pragma once
/*!
* @file FontUtils.h
*
* Code for handling text and strings in Jak 1's "large font" format.
*
* MAKE SURE THIS FILE IS ENCODED IN UTF-8!!! The various strings here depend on it.
* Always verify the encoding if string detection suddenly goes awry.
*/
#include <map>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
#include "common/common_types.h"
// version of the game text file's text encoding. Not real, but we need to differentiate them
// somehow, since the encoding changes.
enum class GameTextVersion {
JAK1_V1 = 10, // jak 1 (ntsc-u v1)
JAK1_V2 = 11, // jak 1 (pal+)
JAK2 = 20, // jak 2
JAK3 = 30, // jak 3
JAKX = 40 // jak x
};
extern const std::unordered_map<std::string, GameTextVersion> sTextVerEnumMap;
const std::string& get_text_version_name(GameTextVersion version);
GameTextVersion get_text_version_from_name(const std::string& name);
/*!
* What bytes a set of characters (UTF-8) correspond to. You can convert to and fro.
*/
struct EncodeInfo {
std::string chars;
std::vector<u8> bytes;
};
/*!
* Replace an unconventional string of characters with/from something more readable.
* For example, turns Ñ into N + ~ + a bunch of modifiers.
*/
struct ReplaceInfo {
std::string from;
std::string to;
};
/*!
* All the information to convert UTF-8 text into game text.
*/
class GameTextFontBank {
GameTextVersion m_version; // the version of the game text. we determine this ourselves.
std::vector<EncodeInfo>* m_encode_info;
std::vector<ReplaceInfo>* m_replace_info;
std::unordered_set<char>* m_passthrus;
const EncodeInfo* find_encode_to_utf8(const char* in) const;
const EncodeInfo* find_encode_to_game(const std::string& in, int off = 0) const;
const ReplaceInfo* find_replace_to_utf8(const std::string& in, int off = 0) const;
const ReplaceInfo* find_replace_to_game(const std::string& in, int off = 0) const;
std::string replace_to_utf8(std::string& str) const;
std::string replace_to_game(std::string& str) const;
std::string encode_utf8_to_game(std::string& str) const;
public:
GameTextFontBank(GameTextVersion version,
std::vector<EncodeInfo>* encode_info,
std::vector<ReplaceInfo>* replace_info,
std::unordered_set<char>* passthrus);
const std::vector<EncodeInfo>* encode_info() const { return m_encode_info; }
const std::vector<ReplaceInfo>* replace_info() const { return m_replace_info; }
const std::unordered_set<char>* passthrus() const { return m_passthrus; }
GameTextVersion version() const { return m_version; }
// TODO - methods would help make this code a lot better for different game versions
// hacking it for now
bool valid_char_range(const char in) const;
std::string convert_utf8_to_game(std::string str, bool escape = false) const;
std::string convert_game_to_utf8(const char* in) const;
};
extern GameTextFontBank g_font_bank_jak1_v1;
extern GameTextFontBank g_font_bank_jak1_v2;
extern GameTextFontBank g_font_bank_jak2;
extern std::map<GameTextVersion, GameTextFontBank*> g_font_banks;
const GameTextFontBank* get_font_bank(GameTextVersion version);
const GameTextFontBank* get_font_bank(const std::string& name);
bool font_bank_exists(GameTextVersion version);