Microsoft accidentally exposed 38 terabytes of data from employee workstations

A mistake led to the exposure of 38 terabytes of data, including troves of private keys, passwords and internal Microsoft Teams messages from hundreds of employees, Microsoft confirmed on Monday.

The leaked data was discovered by researchers from the security firm Wiz, who notified the company on June 22.

“This issue was responsibly reported under a coordinated vulnerability disclosure and has already been addressed,” a Microsoft spokesperson said. “We have confirmed that no customer data was exposed, and no other internal services were put at risk.”

The spokesperson directed Recorded Future News to a blog post on the issue that explained how a Microsoft employee shared a URL in a public GitHub repository while contributing to open-source AI learning models.

“This URL included an overly-permissive Shared Access Signature (SAS) token for an internal storage account. Security researchers at Wiz were then able to use this token to access information in the storage account,” Microsoft officials said.

“Data exposed in this storage account included backups of two former employees’ workstation profiles and internal Microsoft Teams messages of these two employees with their colleagues.”

Both Microsoft and Wiz sourced the issue back to SAS tokens, an Azure feature that allows users to share data from Azure Storage accounts.

Microsoft said the tokens provide a mechanism to restrict access to data while allowing certain clients to connect to specified Azure Storage resources.

According to Microsoft, a researcher “inadvertently included this SAS token” while contributing to the open-source AI learning models. SAS tokens should be created and managed properly, Microsoft said, adding that it is “making ongoing improvements to further harden the SAS token feature and continue to evaluate the service to bolster our secure-by-default posture.”

Once Microsoft was informed of the issue by Wiz in June, it revoked the SAS token and prevented all external access to the storage account by June 24. A subsequent investigation revealed there was no risk to customers and that there was no vulnerability exploited.

Wiz co-founder Ami Luttwak told Recorded Future News that the incident is an example of the caution that needs to be taken as companies rush to deploy AI systems.

“As data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards. The latest Wiz research discovery, which is part of a broader company initiative focused on AI security, exemplifies AI challenges: This emerging technology requires large sets of data to train on,” Luttwak said.

“With many development teams needing to manipulate massive amounts of data, share it with their peers or collaborate on public open-source projects, cases like Microsoft’s are increasingly hard to monitor and avoid.”

Valid until 2051

Wiz’s blog post on the issue said researchers were able to use the exposed SAS token to access the disk backup of two employees’ workstations – which included “secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages” from 359 Microsoft employees.

The issue was reported to Microsoft on June 22 and Microsoft finished its investigation on August 16.

The blog focuses on two issues: the oversharing of data required to train AI models and the complications of using SAS tokens.

Wiz initially discovered the issue while scanning the internet for misconfigured storage containers. Its researchers found a GitHub repository under the Microsoft organization named “robust-models-transfer.”

The repository belongs to Microsoft’s AI research division, and its purpose is to provide open-source code and AI models for image recognition, the researchers explained. Users of the repository were told to download the models from an Azure Storage URL, but the SAS token in that URL was configured to grant permissions on the entire storage account, exposing additional private data by mistake.

The token, in Wiz’s view, was misconfigured to allow anyone not only to view the files in the account but also to delete or overwrite them.
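For readers who want to see what such a misconfiguration looks like in practice, the following is a minimal Python sketch that inspects the query string of a SAS URL and flags the two problems described above: account-wide scope and permissions beyond read access. The function name `audit_sas_url`, the sample storage account and the sample URL are invented for illustration and are not the token from the incident. The sketch assumes the standard SAS query parameters (`sp` for permissions, `ss` and `srt` for account-level tokens) and checks only a subset of permission letters.

```python
from urllib.parse import urlsplit, parse_qs

# Subset of SAS permission letters that allow modifying data.
# (Assumption: only these four are checked in this sketch.)
RISKY_PERMISSIONS = {"w": "write", "d": "delete", "c": "create", "a": "add"}


def audit_sas_url(url: str) -> list[str]:
    """Return warnings for a SAS URL. Hypothetical helper, not an Azure API."""
    params = parse_qs(urlsplit(url).query)
    warnings = []

    # Account SAS tokens carry 'ss' (services) and 'srt' (resource types);
    # narrower service SAS tokens carry 'sr' instead.
    if "ss" in params and "srt" in params:
        warnings.append("account-level SAS: scope is the whole storage account")

    # 'sp' holds the granted permissions as a string of letters, e.g. "rl".
    granted = params.get("sp", [""])[0]
    for letter, name in RISKY_PERMISSIONS.items():
        if letter in granted:
            warnings.append(f"grants {name} permission")

    return warnings


# Made-up example URL; the account name and signature are placeholders.
example = (
    "https://exampleaccount.blob.core.windows.net/models"
    "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2051-01-01T00:00:00Z&sig=REDACTED"
)
for warning in audit_sas_url(example):
    print(warning)
```

A read-only link to a single container would produce no warnings from this check, which is roughly the configuration a download link for public models would need.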

“An attacker could have injected malicious code into all the AI models in this storage account, and every user who trusts Microsoft’s GitHub repository would’ve been infected by it,” the researchers said.

“However, it’s important to note this storage account wasn’t directly exposed to the public; in fact, it was a private storage account.”

The post goes in depth on how highly-permissive non-expiring SAS tokens are a problem for organizations that have lax policies around their Azure storage systems. Revoking the tokens is not easy, Wiz noted.

“These unique pitfalls make this service an easy target for attackers looking for exposed data. Besides the risk of accidental exposure, the service’s pitfalls make it an effective tool for attackers seeking to maintain persistency on compromised storage accounts,” Wiz researchers said.

“A recent Microsoft report indicates that attackers are taking advantage of the service’s lack of monitoring capabilities in order to issue privileged SAS tokens as a backdoor. Since the issuance of the token is not documented anywhere, there is no way to know that it was issued and act against it.”

Their research found that organizations “often use tokens with a very long (sometimes infinite) lifetime, as there is no upper limit on a token's expiry.” In the situation disclosed on Monday, the token was valid until 2051.
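A token's expiry travels in the URL itself, in the `se` query parameter, so an overly long lifetime can be spotted before a link is shared. The sketch below is one possible check, not a Microsoft or Wiz tool: the function name `sas_lifetime_exceeds`, the seven-day limit and the sample URL are assumptions made for illustration, and a token with no `se` value (for example, one whose expiry is set elsewhere) is simply flagged for manual review.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs


def sas_lifetime_exceeds(url, max_lifetime, now=None):
    """Return True if the SAS URL expires later than now + max_lifetime.

    Hypothetical helper. A URL without an 'se' parameter is also
    reported as True so that a person reviews it.
    """
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlsplit(url).query)
    expiry_raw = params.get("se", [None])[0]
    if expiry_raw is None:
        return True

    # 'se' is an ISO 8601 UTC timestamp such as 2051-01-01T00:00:00Z.
    # Older Python versions do not accept the trailing 'Z', so swap it out.
    expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        # Date-only values carry no offset; treat them as UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)

    return expiry - now > max_lifetime


# Made-up example URL with an expiry in 2051; the signature is a placeholder.
example = (
    "https://exampleaccount.blob.core.windows.net/models"
    "?sv=2020-08-04&sr=c&sp=rl&se=2051-01-01T00:00:00Z&sig=REDACTED"
)
print(sas_lifetime_exceeds(example, timedelta(days=7)))
```

The limit is passed in as a parameter because an acceptable lifetime is a policy decision for each organization; the article's sources do not prescribe a specific number.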

There is no official way to track the tokens within Azure or monitor how they are issued, according to Wiz.

Using Account SAS tokens for external sharing is unsafe and should be avoided, the researchers said.
