Virginia Tech mathematicians target data center inefficiency
The manic pace of sharing, storing, securing, and serving data carries a steep price: power consumption. To counter this, Virginia Tech mathematicians are leveraging algebraic geometry to target the inefficiencies of data centers.
“We as individuals generate tons of data all the time, not to mention what large companies are producing,” said Gretchen Matthews, mathematics professor and director of the Southwest Virginia node of the Commonwealth Cyber Initiative. “Backing up that data can mean replicating and storing twice or three times as much information if we don’t consider smart alternatives.”
Instead of relying on energy-intensive data replication, Matthews and Hiram Lopez, assistant professor of mathematics, explored using certain algebraic structures to break information into pieces and spread it across servers located near one another. When one server goes down, the algorithm polls the neighboring servers until it recovers the missing data.
Mathematicians have known since the 1960s that polynomials can be used to store information. But in the last decade, researchers discovered how to build special polynomials that can store data in convenient configurations for applications like recovering missing information locally.
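To make the idea concrete, here is a minimal Python sketch of that classical approach, in the style of a Reed-Solomon erasure code rather than the authors’ specific construction; the field size, server count, and sample data are illustrative assumptions.

```python
# Minimal sketch of polynomial-based storage (Reed-Solomon style),
# not the authors' specific construction. Data symbols become the
# coefficients of a polynomial over a prime field; each server stores
# one evaluation, and any k of the n evaluations determine the data.

P = 257  # prime field size; one byte per symbol fits below it

def evaluate(coeffs, x):
    """Evaluate the polynomial with these coefficients at x (mod P)."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P
    return result

def interpolate_at(points, x0):
    """Lagrange interpolation: read f(x0) from known (x, f(x)) pairs."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # den^-1 mod P
    return total

data = [72, 105, 33]                               # k = 3 data symbols
shares = [evaluate(data, x) for x in range(1, 6)]  # n = 5 servers

# Servers 2 and 4 fail. Any 3 surviving shares rebuild a lost share.
surviving = [(1, shares[0]), (3, shares[2]), (5, shares[4])]
assert interpolate_at(surviving, 2) == shares[1]
```

Because any three of the five evaluations determine the degree-2 polynomial, this toy system tolerates two simultaneous server failures while storing far less than three full copies of the data.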
“Turns out there’s some beautiful mathematical structures that were developed over the years that can provide a better way to store data and serve additional requests,” Matthews said.
They presented their new method for storing and serving data in an invited review article in the December issue of IEEE BITS magazine.
Matthews and Lopez’s work comes at a time when demand for electricity is surging across the nation: Grid planners forecast peak demand to increase by 38 gigawatts through 2028, a jump driven largely by new data centers, many of them under construction or planned in Virginia.
Besides targeting inefficiencies in how server farms store data, the method also addresses energy use associated with how data center algorithms search for requested information.
“All of these structures are tied to the physical world, and they are subject to space and time,” said Lopez. “It takes energy to find and retrieve information.”
If too many people try to access the same information at once, the system fails, resulting in what’s colloquially called “breaking the internet.” When a selfie or a video goes viral, every request to see or share the content pings some of the servers that store the backups. At some point, no copies are available to serve the next request, and the server crashes.
Matthews and Lopez’s technique, which uses an error-correcting code, improves data access and storage in two main ways:
- Servers don’t have to store complete duplicates of any information, which frees up storage space.
- During a server failure or data erasure, the algorithm doesn’t have to expend energy searching the entire network to recover the missing information; it only needs to query the neighboring servers, as sketched below.
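Here is a minimal sketch of that local-recovery idea, using a simple per-group XOR parity as an illustrative stand-in for the error-correcting code in the paper; the group sizes and block contents are made up for the example.

```python
# Minimal sketch of local recovery with per-group XOR parity; an
# illustrative stand-in, not the error-correcting code from the paper.

def xor_blocks(blocks):
    """Bytewise XOR of equal-length data blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Four data blocks on four servers, split into two local groups.
groups = [[b"\x01\x02", b"\x03\x04"], [b"\x05\x06", b"\x07\x08"]]

# Each group stores one extra parity block on a nearby server.
parities = [xor_blocks(group) for group in groups]

# The server holding groups[0][1] fails. Rebuilding it touches only
# the two other servers in its own group, never the whole network.
rebuilt = xor_blocks([groups[0][0], parities[0]])
assert rebuilt == groups[0][1]
```

The point of the grouping is locality: a single failure is repaired by contacting a fixed, small set of nearby servers, no matter how large the overall network grows.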
In a subsequent research project and publication in Designs, Codes and Cryptography, Matthews and collaborators showed that the underlying structure of a particularly famous class of error-correcting codes, called Reed-Muller codes, naturally allows for the recovery of missing information.
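The property at work can be illustrated with another hedged sketch: in a Reed-Muller code, the codeword lists the values of a low-degree multivariate polynomial at every point of a finite grid, and restricting that polynomial to a line through a lost point leaves a low-degree univariate polynomial, so the lost value can be re-interpolated from just a few other symbols. The field, degree, and sample polynomial below are arbitrary choices for illustration, not taken from the paper.

```python
# Sketch of the Reed-Muller local-recovery principle. A codeword
# lists the values of a low-degree polynomial f over a grid F_q^2;
# the field, degree, and sample f below are arbitrary illustrations.

Q = 13  # field size (a prime)
R = 2   # total degree of f, so any line restriction has degree <= 2

def f(x, y):
    """A sample degree-2 polynomial; its 169 values form the codeword."""
    return (3 * x * x + 5 * x * y + 7 * y + 2) % Q

def recover(a, b):
    """Recover the lost value f(a) from R + 1 points on the line a + t*b."""
    ts = [1, 2, 3]  # R + 1 nonzero line parameters
    ys = [f((a[0] + t * b[0]) % Q, (a[1] + t * b[1]) % Q) for t in ts]
    # g(t) = f(a + t*b) is univariate of degree <= R; interpolate it
    # and evaluate at t = 0, which is exactly the lost point a.
    total = 0
    for i, ti in enumerate(ts):
        num, den = 1, 1
        for tj in ts:
            if tj != ti:
                num = num * (0 - tj) % Q
                den = den * (ti - tj) % Q
        total = (total + ys[i] * num * pow(den, Q - 2, Q)) % Q
    return total

# Rebuild the symbol at (2, 3) by reading just 3 of the 169 symbols.
assert recover((2, 3), (1, 1)) == f(2, 3)
```

Any line through the lost point works, which is why the recovery is “natural”: locality is built into the code’s geometry rather than bolted on.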
This type of application demonstrates how deep mathematics can be relevant to the issues our society is facing, not just here in the commonwealth, but as a nation and as a global community, Matthews said.
“Improving the systems and processes we already have in place can help us meet our goals for sustainable growth,” Matthews said.