jak-project/common/util/trie_with_duplicates.h
Tyler Wilding d1ece445d4
Dependency graph work - Part 1 - Preliminary work (#3505)
Relates to #1353 

This adds no new functionality or overhead to the compiler, yet. This is
the preliminary work that has:
- added code to the compiler in several spots to flag when something is
used without being properly required/imported/whatever (disabled by
default)
- that was used to generate project-wide file dependencies (some
circular dependencies were manually fixed)
- then that graph underwent a transitive reduction and the result was
written to all `jak1` source files.

The next step will be making this actually produce and use a dependency
graph. Some of the reasons why I'm working on this:
- eliminates more `game.gp` boilerplate. This includes the `.gd` files
to some extent (`*-ag` files and `tpage` files will still need to be
handled); this is the point of the new `bundles` form. This should make
it even easier to add a new file into the source tree.
- a build order that is actually informed by something real, and
compiler warnings that tell you when you are using something that won't
be available at build time.
- narrows the search space for doing LSP actions -- like searching for
references. Since it would be way too much work to store in the compiler
every location where every symbol/function/etc is used, I have to do
ad-hoc searches. By having a dependency graph I can significantly reduce
that search space.
- opens the doors for common shared code with a legitimate pattern.
Right now jak 2 shares code from the jak 1 folder. This is basically a
hack -- but by having an explicit require syntax, it would be possible
to reference arbitrary file paths, such as a `common` folder.

Some stats:
- Jak 1 has about 2500 edges between files, including transitives
- With transitives reduced at the source code level, each file seems to
have a modest amount of explicit requirements.

Known issues:
- Tracking the location for where `defmacro`s and virtual state
definitions were defined (and therefore the file) is still problematic.
Because those forms are in a macro environment, the reader does not
track them. I'm wondering if a workaround could be to search the
reader's text_db by not just the `goos::Object` but by the text
position. But for the purposes of finishing this work, I just statically
analyzed and searched the code with throwaway Python code.
2024-05-12 12:37:59 -04:00


#pragma once

#include <array>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// A normal trie does not allow duplicate keys; this one does.
// It supports both insertion and removal.
template <typename T>
class TrieWithDuplicates {
 private:
  struct TrieNode {
    // One child slot per possible byte value of a key character.
    std::array<std::unique_ptr<TrieNode>, 256> children;
    // Every element that was inserted under the key ending at this node.
    std::vector<std::unique_ptr<T>> elements;
  };

  std::unique_ptr<TrieNode> root = std::make_unique<TrieNode>();

 public:
  TrieWithDuplicates() {}
  // Stores a copy of `element` under `key` and returns a pointer to the stored copy.
  // Inserting the same key again adds another element instead of replacing one.
  T* insert(const std::string& key, const T& element) {
    std::unique_ptr<T> new_element = std::make_unique<T>(element);
    TrieNode* curr_node = root.get();
    for (const char character : key) {
      auto& child = curr_node->children[(uint8_t)character];
      if (!child) {
        child = std::make_unique<TrieNode>();
      }
      curr_node = child.get();
    }
    curr_node->elements.push_back(std::move(new_element));
    return curr_node->elements.back().get();
  }
  // Returns the elements of every key that starts with `prefix`.
  // A negative `max_count` means no limit.
  std::vector<T*> retrieve_with_prefix(const std::string& prefix, int max_count = -1) const {
    std::vector<T*> results;
    TrieNode* curr_node = root.get();
    // `results` stays empty while walking the prefix, so no limit check is needed here.
    for (const char character : prefix) {
      const auto& child = curr_node->children.at((uint8_t)character);
      if (child == nullptr) {
        return results;  // tree ends, nothing found with that prefix
      }
      curr_node = child.get();
    }
    retrieve_elements(curr_node, results, max_count);
    return results;
  }
  std::vector<T*> retrieve_with_exact(const std::string& key) const {
    std::vector<T*> results;
    TrieNode* curr_node = root.get();
    for (const char character : key) {
      const auto& child = curr_node->children.at((uint8_t)character);
      if (child == nullptr) {
        return results;  // tree ends, nothing found with that key
      } else {
        curr_node = child.get();
      }
    }
    for (const auto& element : curr_node->elements) {
      results.push_back(element.get());
    }
    return results;
  }
  bool remove(const std::string& key, const T* to_be_removed) {
    TrieNode* curr_node = root.get();
    for (const char character : key) {
      const auto& child = curr_node->children.at((uint8_t)character);
      if (child == nullptr) {
        return false;  // tree ends, nothing found with that key
      } else {
        curr_node = child.get();
      }
    }
    // Since the trie holds duplicates, we can't delete on the key alone.
    // Search for the element whose address matches the given pointer.
    auto it = curr_node->elements.begin();
    while (it != curr_node->elements.end()) {
      if (it->get() == to_be_removed) {
        it = curr_node->elements.erase(it);
        return true;  // we can assume that the same ptr isn't stored twice.
      } else {
        ++it;
      }
    }
    return false;
  }
  // Return the total number of elements stored in the trie
  int size() const {
    int count = 0;
    count_elements(root.get(), count);
    return count;
  }
  // Return every stored element, regardless of key
  std::vector<T*> get_all_elements() const {
    std::vector<T*> results;
    get_all_elements_helper(root.get(), results);
    return results;
  }
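 private:
  // Depth-first collection of elements at and below `node`.
  // Stops once `results` holds `max_count` elements; a negative `max_count` means no limit.
  void retrieve_elements(const TrieNode* node, std::vector<T*>& results, int max_count = -1) const {
    for (const auto& element : node->elements) {
      // `>=` rather than `>`: with `>` one element too many would be collected.
      if (max_count >= 0 && (int)results.size() >= max_count) {
        return;
      }
      results.push_back(element.get());
    }
    for (const auto& child : node->children) {
      if (max_count >= 0 && (int)results.size() >= max_count) {
        return;
      }
      if (child.get() != nullptr) {
        retrieve_elements(child.get(), results, max_count);
      }
    }
  }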
  void count_elements(const TrieNode* node, int& count) const {
    // Explicit cast: `size()` returns an unsigned type, `count` is an int.
    count += static_cast<int>(node->elements.size());
    for (const auto& child : node->children) {
      if (child.get() != nullptr) {
        count_elements(child.get(), count);
      }
    }
  }
  void get_all_elements_helper(const TrieNode* node, std::vector<T*>& result) const {
    for (const auto& element : node->elements) {
      result.push_back(element.get());
    }
    for (const auto& child : node->children) {
      if (child.get() != nullptr) {
        get_all_elements_helper(child.get(), result);
      }
    }
  }
};
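
The header above keeps every element inserted under a key and removes by pointer identity rather than by key alone. The following is a minimal sketch of that idea, not the project's code: `MiniTrie`, `SymbolInfo`, `demo_duplicate_keys`, and the file names are hypothetical stand-ins so the example compiles on its own, and `MiniTrie` only mirrors the insert, prefix lookup, and remove behaviour of `TrieWithDuplicates`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, trimmed-down stand-in for TrieWithDuplicates<T>.
template <typename T>
class MiniTrie {
  struct Node {
    std::array<std::unique_ptr<Node>, 256> children;
    std::vector<std::unique_ptr<T>> elements;  // all values stored under one key
  };
  std::unique_ptr<Node> root = std::make_unique<Node>();

  // Walk the key byte by byte; create missing nodes only when `create` is set.
  Node* walk(const std::string& key, bool create) {
    Node* node = root.get();
    for (const char c : key) {
      auto& child = node->children[static_cast<uint8_t>(c)];
      if (!child) {
        if (!create) return nullptr;
        child = std::make_unique<Node>();
      }
      node = child.get();
    }
    return node;
  }

  // Depth-first collection of every element at or below `node`.
  static void collect(const Node* node, std::vector<T*>& out) {
    for (const auto& e : node->elements) out.push_back(e.get());
    for (const auto& child : node->children) {
      if (child) collect(child.get(), out);
    }
  }

 public:
  T* insert(const std::string& key, const T& value) {
    Node* node = walk(key, true);
    node->elements.push_back(std::make_unique<T>(value));
    return node->elements.back().get();
  }

  std::vector<T*> with_prefix(const std::string& prefix) {
    std::vector<T*> out;
    if (Node* node = walk(prefix, false)) collect(node, out);
    return out;
  }

  // Removal needs the pointer returned by insert, because the key alone
  // cannot tell duplicates apart.
  bool remove(const std::string& key, const T* target) {
    Node* node = walk(key, false);
    if (!node) return false;
    for (auto it = node->elements.begin(); it != node->elements.end(); ++it) {
      if (it->get() == target) {
        node->elements.erase(it);
        return true;
      }
    }
    return false;
  }
};

// Hypothetical payload: a symbol name and the (made-up) file that defines it.
struct SymbolInfo {
  std::string name;
  std::string file;
};

// Inserts the same key twice, removes one copy, and returns how many remain.
std::size_t demo_duplicate_keys() {
  MiniTrie<SymbolInfo> trie;
  SymbolInfo* first = trie.insert("vector-length", {"vector-length", "a.gc"});
  trie.insert("vector-length", {"vector-length", "b.gc"});
  trie.insert("vector-dot", {"vector-dot", "a.gc"});
  assert(trie.with_prefix("vector-").size() == 3);
  trie.remove("vector-length", first);
  return trie.with_prefix("vector-length").size();
}
```

Holding each element in a `std::unique_ptr` is what lets `insert` hand out a pointer that remains usable as a removal handle: growing the `elements` vector moves the owning pointers, not the elements they point to.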