Reinforcement Learning in Buchberger's Algorithm
Buchberger’s algorithm is the classical algorithm for computing a Gröbner basis, and optimized implementations are crucial for many computer algebra systems. In this thesis we introduce a new approach to Buchberger’s algorithm that uses deep reinforcement learning agents to perform S-pair selection, a key choice point in the algorithm. We first study how the difficulty of the problem varies across several random distributions of polynomial ideals, about which little was previously known. Next, we train a policy model using proximal policy optimization to learn S-pair selection strategies for random systems of binomial equations. In certain domains the trained model outperforms state-of-the-art selection heuristics in the total number of polynomial additions performed, providing a proof-of-concept that recent developments in machine learning have the potential to improve the performance of critical algorithms in symbolic computation. Finally, we apply these techniques to Bigatti’s algorithm for computing a Hilbert series, and consider extending these results with neural network value models and tree search.
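To make the role of S-pair selection concrete, the following is a minimal sketch of Buchberger's algorithm with the selection strategy factored out as a pluggable function, written with SymPy. The function names (`buchberger`, `select_first`, `s_polynomial`) are illustrative assumptions, not the thesis's implementation; the thesis replaces the trivial `select_first` strategy with a learned reinforcement learning policy.

```python
# Illustrative sketch of Buchberger's algorithm with a pluggable
# S-pair selection strategy (not the thesis's implementation).
from sympy import symbols, LT, lcm, reduced, expand

x, y = symbols('x y')

def s_polynomial(f, g, gens, order='grevlex'):
    """S-polynomial of f and g: cancels their leading terms."""
    lt_f = LT(f, *gens, order=order)
    lt_g = LT(g, *gens, order=order)
    m = lcm(lt_f, lt_g)
    return expand(m / lt_f * f - m / lt_g * g)

def select_first(pairs):
    """Trivial strategy: process pending S-pairs in FIFO order.
    This is the choice point an RL agent could learn instead."""
    return pairs.pop(0)

def buchberger(F, gens, select=select_first, order='grevlex'):
    """Return a (non-reduced) Groebner basis of the ideal <F>."""
    G = list(F)
    pairs = [(i, j) for i in range(len(G)) for j in range(i + 1, len(G))]
    while pairs:
        i, j = select(pairs)
        s = s_polynomial(G[i], G[j], gens, order)
        _, r = reduced(s, G, *gens, order=order)
        if r != 0:
            # Nonzero remainder: append it and queue new S-pairs.
            pairs += [(k, len(G)) for k in range(len(G))]
            G.append(r)
    return G

# Small example: the S-polynomial of the two generators reduces to a
# new basis element, after which all remaining pairs reduce to zero.
G = buchberger([x**2 + y, x*y - 1], (x, y))
```

Each call to `select` determines which S-polynomial is reduced next; the final basis is the same for any fair strategy, but the number of polynomial additions performed along the way (the cost measure used above) can differ dramatically between strategies.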