jak-project/common/util/Trie.h

#pragma once
#include <string>
#include <vector>
#include "common/util/Assert.h"
/*!
* A simple prefix tree. It works like a map from strings to objects, and it also supports fast
* retrieval of all entries whose keys begin with a given prefix.
*
* It owns the memory for the objects it stores.
* Doing an insert will create a copy of your object.
*
* Other than deleting the whole thing, there is no support for removing a node.
*/
template <typename T>
class Trie {
public:
Trie() = default;
Trie(const Trie&) = delete;
Trie& operator=(const Trie&) = delete;
// Insert an object, replacing an existing one if it exists
void insert(const std::string& str, const T& obj);
// Get the object at the string. Default construct a new one if none exists.
T* operator[](const std::string& str);
// Lookup an existing object. If none exists, return nullptr.
T* lookup(const std::string& str) const;
// Return the number of entries.
int size() const { return m_size; }
// Get all objects starting with the given prefix.
std::vector<T*> lookup_prefix(const std::string& str) const;
// Get all objects stored in the tree.
std::vector<T*> get_all_nodes() const;
~Trie();
private:
static constexpr int CHAR_SIZE = 256;
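// Map a char to a child index in [0, CHAR_SIZE), even when char is signed.
static int idx(char c) { return static_cast<unsigned char>(c); }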
struct Node {
T* value = nullptr;
Node* children[CHAR_SIZE] = {};  // all children start as nullptr
void delete_children() {
if (value) {
delete value;
value = nullptr;
}
// iterate by reference so that the stored pointer itself is cleared
for (auto& child : children) {
if (child) {
child->delete_children();
delete child;
child = nullptr;
}
}
}
/*!
* Return true if a new object was inserted.
*/
bool insert(const char* str, const T* obj) {
if (!*str) {
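// End of the key: this node holds the value.
// Copy before deleting, so obj may alias the value being replaced.
const bool is_new = (value == nullptr);
T* replacement = new T(*obj);
delete value;  // deleting nullptr is a no-op
value = replacement;
return is_new;  // the count only changes for a new entry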
} else {
// still more to go
char first = *str;
if (!children[idx(first)]) {
children[idx(first)] = new Node();
}
return children[idx(first)]->insert(str + 1, obj);
}
}
T* bracket_operator(const char* str, bool* inserted) {
if (!*str) {
// End of the key: this node holds the value.
if (!value) {
value = new T();
*inserted = true;
} else {
*inserted = false;
}
return value;
} else {
// still more to go
char first = *str;
if (!children[idx(first)]) {
children[idx(first)] = new Node();
}
return children[idx(first)]->bracket_operator(str + 1, inserted);
}
}
T* lookup(const char* str) const {
if (!*str) {
return value;
}
if (children[idx(*str)]) {
return children[idx(*str)]->lookup(str + 1);
}
return nullptr;
}
void get_all_children(std::vector<T*>& result) const {
if (value) {
result.push_back(value);
}
for (auto child : children) {
if (child) {
child->get_all_children(result);
}
}
}
std::vector<T*> lookup_prefix(const char* str) const {
if (!*str) {
std::vector<T*> result;
get_all_children(result);
return result;
} else {
if (children[idx(*str)]) {
return children[idx(*str)]->lookup_prefix(str + 1);
} else {
return {};
}
}
}
};
Node m_root;
int m_size = 0;
};
template <typename T>
Trie<T>::~Trie() {
m_root.delete_children();
m_size = 0;
}
template <typename T>
void Trie<T>::insert(const std::string& str, const T& obj) {
if (m_root.insert(str.c_str(), &obj)) {
m_size++;
}
}
template <typename T>
T* Trie<T>::lookup(const std::string& str) const {
return m_root.lookup(str.c_str());
}
template <typename T>
T* Trie<T>::operator[](const std::string& str) {
bool added = false;
auto result = m_root.bracket_operator(str.c_str(), &added);
if (added) {
m_size++;
}
return result;
}
template <typename T>
std::vector<T*> Trie<T>::lookup_prefix(const std::string& str) const {
return m_root.lookup_prefix(str.c_str());
}
template <typename T>
std::vector<T*> Trie<T>::get_all_nodes() const {
std::vector<T*> result;
m_root.get_all_children(result);
return result;
}
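
/*
 * Usage sketch for the Trie above. This is illustrative only and was not part of the original
 * header: the keys and values are made up, and it assumes T is default-constructible and
 * copyable (here, int).
 *
 *   Trie<int> trie;
 *   trie.insert("jak", 1);
 *   trie.insert("jakbot", 2);
 *   trie.insert("daxter", 3);
 *   trie.insert("jak", 10);                    // replaces the entry; size() is still 3
 *
 *   int* hit = trie.lookup("jak");             // points to 10
 *   int* miss = trie.lookup("ja");             // nullptr: "ja" is a prefix, not a key
 *   auto matches = trie.lookup_prefix("jak");  // two entries: "jak", then "jakbot"
 *   int* fresh = trie["keira"];                // default-constructs an entry; size() is now 4
 *
 * Returned pointers stay owned by the Trie. Calling insert on an existing key deletes the old
 * object, so a pointer obtained earlier for that key must not be used afterwards.
 */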