GraphQL Vulnerabilities - Applied Review

What is GraphQL?

GraphQL is a query language designed to provide efficient communication between clients and servers by having the client specify exactly what data they want in the response. This helps avoid the overly large responses you often get from APIs that return a fixed payload, such as many REST endpoints.

This works by having a GraphQL server receive requests from clients and fetch data from the relevant locations, so the client does not need to know where the data resides. GraphQL is versatile: it can be implemented in a variety of programming languages and can communicate with most data stores.

Under the Hood?

GraphQL uses a schema to define the structure of the service’s data. This schema lists the available objects, fields, and relationships in the data store.

This data can be manipulated using queries to fetch the data, mutations to change the data, or subscriptions, which resemble queries but keep a connection open so the server can push new results whenever the data changes.

To make things easier for the client, all operations for a GraphQL service are sent to the same endpoint, typically as a POST request. The service usually responds with a JSON object in the structure that the client requested.
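
As a rough illustration of that request shape, the sketch below builds (but does not send) a JSON-encoded POST request using only Python's standard library. The URL https://example.com/graphql, the field names in the query, and the helper name build_graphql_request are assumptions made up for this example.

```python
import json
import urllib.request


def build_graphql_request(url, query, variables=None):
    """Build (but do not send) a JSON-encoded GraphQL POST request."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint: every operation for the service goes to this one URL.
request = build_graphql_request(
    "https://example.com/graphql",
    "query { movies { title } }",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to urllib.request.urlopen, which is left out here so the sketch has no network side effects.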

Schema

The schema defines data as a series of types (objects) which can be implemented by a service. For example, imagine a website that lets you query information about movies - the schema might look like this:

type Movie {
  title: String
  director: Director
}

type Director {
  name: String
  movies: [Movie]
}

In this example, we defined two object types (Movie and Director), where a movie can have an associated director and a director can have a list of movies.

Queries

If you want to retrieve data from a store, you’d use a query. A query usually consists of an operation type, an operation name, the fields to return, and arguments if needed.

For example, if our schema defined the following query type:

type Query {
  movies: [Movie]
  directors: [Director]
}

The query type defines the movies and directors fields, which will return a list of the corresponding type when called.

We might make a query like this:

query GetMoviesAndDirectors {
  movies {
    title
  }

  directors {
    name
  }
}

This requests a list of movie titles and a list of directors separately. The server might respond with something like this:

{
  "data": {
    "movies": [
      {
        "title": "2001: A Space Odyssey"
      }
    ],
    "directors": [
      {
        "name": "Stanley Kubrick"
      }
    ]
  }
}

This example assumes that the data store only contains one movie and one director, but if more were present those would be returned in the response as well.

Mutations

Mutations are how we change data by either modifying it, adding it, or deleting it. These also need a type, name, and structure for the returned data. For example, you might define the addMovie mutation in your schema like this:

type Mutation {
  addMovie(title: String, director: String): Movie
}

And you could add a movie by sending the following mutation:

mutation CreateMovie {
  addMovie(title: "Terminator 2", director: "James Cameron") {
    title
    director {
      name
    }
  }
}

And our server might respond with this:

{
  "data": {
    "addMovie": {
      "title": "Terminator 2",
      "director": {
        "name": "James Cameron"
      }
    }
  }
}

Pretty straightforward so far.

Fields and Arguments

All of the types we have made contain query-able items called fields, which are just the pieces of data we want the API to return in a given query or mutation.

In our previous examples, the two fields of the Movie type are title and director. A field can return a simple value, like the title string, or another type, like Director.

You can add arguments to a field to narrow down the results. For example, we might have a query like this:

query getDirectorQuery {
  getDirector(title: "2001: A Space Odyssey") {
    director {
      name
    }
  }
}

With these arguments, the query would return the following:

{
  "data": {
    "getDirector": [
      {
        "director": {
          "name": "Stanley Kubrick"
        }
      }
    ]
  }
}

Aliases

If a query requests the same field more than once, the results would collide under the same name in the response, so you need aliases to give each one its own label. For example:

query getMovieDetails{
	movie1: getMovie(director: "James Cameron"){
		movies
	}
	movie2: getMovie(director: "Stanley Kubrick"){
		movies
	}
}

This would return two pieces of data labeled movie1 and movie2 respectively, containing the movies associated with each director.

Introspection

Introspection is a built-in GraphQL feature that lets a client ask the server for information about its schema. If it is left enabled, it can present a serious information disclosure risk, because it shows a potential attacker how to interact with the API to get potentially sensitive information.

Finding GraphQL Endpoints

Before we can exploit GraphQL vulnerabilities, we need to confirm that the target application uses GraphQL and find the endpoint. If normal use of the application does not reveal the endpoint, you can usually find it by fuzzing common paths and directories.
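
One way to approach that fuzzing is sketched below. Every GraphQL type supports the __typename meta-field, so a tiny query for it is valid against any schema and makes a convenient probe. The path list is an assumed starting wordlist rather than an exhaustive one, and the helper names are made up for this example. Actually sending the probe to each candidate URL is left to your HTTP client or proxy of choice.

```python
import json

# Commonly seen GraphQL paths; an assumed starting wordlist, not exhaustive.
CANDIDATE_PATHS = [
    "/graphql",
    "/api",
    "/api/graphql",
    "/graphql/api",
    "/graphql/graphql",
]

# Request body for the probe: a query that is valid against any schema.
PROBE_BODY = json.dumps({"query": "query{__typename}"})


def candidate_urls(base_url):
    """Join the base URL with each candidate path."""
    base = base_url.rstrip("/")
    return [base + path for path in CANDIDATE_PATHS]


def looks_like_graphql(response_text):
    """Return True if a response body looks like an answer to the probe."""
    try:
        parsed = json.loads(response_text)
    except ValueError:
        return False  # not JSON at all, e.g. an HTML 404 page
    if not isinstance(parsed, dict):
        return False
    data = parsed.get("data")
    return isinstance(data, dict) and isinstance(data.get("__typename"), str)


for url in candidate_urls("https://example.com/"):
    print("POST", url, PROBE_BODY)
```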

Exploiting Unsanitized Arguments

Once we have found an endpoint, we can begin to test the query arguments. We might be able to access objects directly using the GraphQL implementation on the target application.

For example, imagine the target web application implements the following query to fetch products from a list:

    query {
        products {
            id
            name
            listed
        }
    }

If we send a request that uses this query, we might see the following response:

    {
        "data": {
            "products": [
                {
                    "id": 1,
                    "name": "Product 1",
                    "listed": true
                },
                {
                    "id": 2,
                    "name": "Product 2",
                    "listed": true
                },
                {
                    "id": 4,
                    "name": "Product 4",
                    "listed": true
                }
            ]
        }
    }

You might notice that the product with an ID of 3 is missing. We might be able to access it by querying for that specific product like so:

    query {
        product(id: 3) {
            id
            name
            listed
        }
    }

By querying this specific product, we might be able to bypass the way the application developers attempted to obscure it.
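
Spotting gaps like this by eye gets tedious on longer lists, so here is a minimal sketch that finds the missing IDs and builds a direct lookup for each one. It assumes IDs are sequential integers and that a product(id: ...) field exists as in the example; the function names are made up for illustration.

```python
def missing_ids(seen_ids):
    """Return IDs absent from the range spanned by the IDs we saw."""
    if not seen_ids:
        return []
    seen = set(seen_ids)
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


def product_query(product_id):
    """Build a direct lookup query for one product ID."""
    return "query { product(id: %d) { id name listed } }" % product_id


# IDs taken from the example product list response.
for gap in missing_ids([1, 2, 4]):
    print(product_query(gap))
```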

Understanding the Schema

We always want to gain more insight into the schema because it gives us an idea of where to look for potentially sensitive information. It also shows us how to interact with the GraphQL API, which makes the rest of the process easier.

Using Introspection

If we query the __schema field, the server will often return the schema information, so long as introspection is enabled and we are able to reach the GraphQL endpoint.

One common introspection query can be found here. We will mostly be using Burp Suite’s tools to view and modify the GraphQL queries as they are sent.

This introspection query will help you map out the schema and give you an idea of what to query next.
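
To give a feel for what comes back, the sketch below pairs a trimmed-down introspection query (type names and field names only, far smaller than the full queries tools normally send) with a helper that flattens the response into a type-to-fields map. The sample response is made up for illustration, and summarize_schema is a hypothetical helper name.

```python
import json

# A trimmed-down introspection query: only type names and their field names.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { name }
    mutationType { name }
    types { name fields { name } }
  }
}
"""


def summarize_schema(introspection_response):
    """Map each non-built-in type name to a list of its field names."""
    schema = introspection_response["data"]["__schema"]
    summary = {}
    for type_info in schema["types"]:
        # Names starting with "__" belong to the introspection system itself.
        if type_info["name"].startswith("__"):
            continue
        fields = type_info.get("fields") or []  # scalars report null fields
        summary[type_info["name"]] = [field["name"] for field in fields]
    return summary


# Made-up response for illustration, shaped like an introspection result.
sample = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "mutationType": None,
            "types": [
                {"name": "Query", "fields": [{"name": "movies"}, {"name": "directors"}]},
                {"name": "String", "fields": None},
                {"name": "__Schema", "fields": [{"name": "types"}]},
            ],
        }
    }
}
print(json.dumps(summarize_schema(sample), indent=2))
```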

Visualizing the Data

If a schema is especially large or complicated, it helps to visualize it with a schema visualization tool. It also helps you understand what you can and can’t access depending on the implementation.

For example, imagine a web application where we can view blog posts and upon viewing a few posts we see that GraphQL is being used to fetch post information. We can send an introspection query and in this case, we find some interesting results about authentication:

{
    "name": "getUser",
    "description": null,
    "args": [
        {
            "name": "id",
            "description": null,
            ...

We can then construct a query to get this user’s information like so:

{
    "query": "query($id: Int!) { getUser(id: $id) { id, username, password } }",
    "variables": {
        "id": 1
    }
}

This would return the username and password of the user with the specified ID.

Common Introspection Defenses and Bypasses

Sometimes developers try to block introspection by using a regex to reject queries that contain the __schema keyword. GraphQL ignores characters such as spaces, newlines, and commas between tokens, but a flawed regex does not, so inserting one of them after the keyword can slip past the filter.

For example, if a developer only excludes the exact string __schema{, a query with a newline after the keyword still returns schema information:

{
    "query": "query{__schema\n{queryType{name}}}"
}

If this still doesn’t work, then try an alternative request method, like switching between GET and POST, since introspection may only be blocked for one of them. You can also try to URL-encode the request body or parameters for better results.
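
The gap between what GraphQL ignores and what a regex matches is easy to demonstrate offline. The sketch below uses a hypothetical naive filter (the pattern is an assumption about how such a block might be written, not taken from any real product) and checks three variants of the same introspection query against it.

```python
import re

# Hypothetical naive filter: blocks only the exact text "__schema{".
NAIVE_FILTER = re.compile(r"__schema\{")


def is_blocked(query):
    """Return True if the naive filter would reject this query."""
    return NAIVE_FILTER.search(query) is not None


direct = "query{__schema{queryType{name}}}"
# GraphQL treats newlines and commas between tokens as insignificant,
# so these two variants mean the same thing to the server.
with_newline = "query{__schema\n{queryType{name}}}"
with_comma = "query{__schema,{queryType{name}}}"

for candidate in (direct, with_newline, with_comma):
    verdict = "blocked" if is_blocked(candidate) else "passes the filter"
    print(repr(candidate), verdict)
```

Only the first variant is caught; the other two reach the GraphQL parser unchanged in meaning.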

Using Aliases to Bypass Rate Limits

Typically, a GraphQL object can’t contain more than one property with the same name. With aliases, though, you can get multiple instances of the same object returned in one request.

Aliases can also be used to brute-force a GraphQL endpoint because some rate-limiting technologies just limit the number of requests instead of the size of the request. If we use aliases, we can get the results of multiple queries in one HTTP request.
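
A minimal sketch of that idea follows. It packs one aliased login attempt per candidate password into a single mutation, so the whole wordlist travels in one HTTP request. The LoginUser mutation and its message and token fields are borrowed from the Passman example later in this post; for any other target the mutation name, arguments, and returned fields are assumptions you would replace with what the schema shows.

```python
import json


def build_aliased_logins(username, passwords):
    """Pack one LoginUser attempt per password into a single mutation."""
    attempts = []
    for index, password in enumerate(passwords):
        # json.dumps yields a double-quoted, escaped string, which also works
        # as a GraphQL string literal for ordinary input.
        attempts.append(
            "  attempt%d: LoginUser(username: %s, password: %s) { message token }"
            % (index, json.dumps(username), json.dumps(password))
        )
    return "mutation {\n" + "\n".join(attempts) + "\n}"


query = build_aliased_logins("admin", ["123456", "password", "letmein"])
print(query)

# The request body that would carry all three attempts at once.
print(json.dumps({"query": query}))
```

Each alias (attempt0, attempt1, ...) shows up as its own key in the response, which makes it easy to match a successful result back to the password that produced it.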

GraphQL CSRF

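If the GraphQL endpoint does not validate the content type of the requests it receives and does not use CSRF protections, we could potentially perform an action on behalf of another user. JSON-encoded POST requests are hard to forge from another site, but GET requests and form-encoded POST requests can be sent by a victim’s browser. From there, the steps are mostly the same as in a normal CSRF attack.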
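
To make the content-type point concrete, the sketch below encodes a GraphQL operation as a form body, which is the kind of request a plain HTML form can submit cross-origin. The changeEmail mutation and the email address are hypothetical and exist only to show the encoding.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical mutation used only to show the encoding.
mutation = 'mutation { changeEmail(input: {email: "attacker@example.com"}) { email } }'

# The same operation as an application/x-www-form-urlencoded body.
form_body = urlencode({"query": mutation})
print(form_body)

# A server that accepts this content type recovers the operation unchanged.
recovered = parse_qs(form_body)["query"][0]
print(recovered == mutation)
```

If the endpoint only accepted application/json bodies, this form-encoded version would be rejected before the mutation was ever parsed.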

HTB Passman (Cyber Apocalypse 2023)

This challenge begins by presenting us with a login page:

We can make an account and log in. The web application seems to be a password manager of some kind, letting us save notes like this:

When we load this page, the following POST request is made:

We can run an introspection query and get back the schema information that allows us to better understand our original query and gives us the knowledge we need to access another user’s vault.

For example, here is our login query:

{"query":"mutation($username: String!, $password: String!) { LoginUser(username: $username, password: $password) { message, token } }","variables":{"username":"gabe","password":"password"}}

This login request tells us that there is a mutation called LoginUser that takes username and password inputs and will return with a message and a token.

If we inspect the result from the introspection query and view the mutations, we can see the following:

Now that we know about the UpdatePassword mutation, we might be able to change another user’s password.

If you read through the source files with this challenge, you can find usernames if I remember correctly. Otherwise the administrator username is pretty easy to guess and isn’t really relevant because we can just change the password. I mention this because we can’t enumerate users using the login function alone.

We can construct our mutation and try it out:

{"query":"mutation($username: String!, $password: String!) { UpdatePassword(username: $username, password: $password) { message } }","variables":{"username":"admin","password":"pwned"}}

Initially, I got an error about not having an authenticated session, but if I just copy over my old session cookie it works just fine:

Then, we can just go ahead and log in as that admin user:

Once logged in we can read the completion flag for this challenge.

Prevention

The three main things we want to prevent are unsafe introspection, brute-forcing, and CSRF.

To prevent unsafe introspection you can:

Disable introspection if the API is private. If the API is public, audit the schema to see whether any sensitive fields are exposed.
Make sure to disable suggestions that might help an attacker fix a buggy query.

To prevent brute-forcing attacks you would want to:

Limit the query depth so that heavily nested queries will not run.
Configure operation limitations like the maximum number of bytes a query can contain or try implementing some cost analysis tools.
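
The depth limit could be approximated as in the sketch below. This is a crude brace counter rather than a real parser, so treat it as an illustration only: a production server should rely on the validation rules of its GraphQL library. The limit of 5 is an arbitrary value chosen for the example.

```python
MAX_DEPTH = 5  # assumed limit for this example


def query_depth(query):
    """Return the deepest level of brace nesting in a query string.

    Braces inside string literals are skipped. Braces of input objects in
    arguments are counted too, which a real parser would not do.
    """
    depth = 0
    deepest = 0
    in_string = False
    previous = ""
    for char in query:
        if char == '"' and previous != "\\":
            in_string = not in_string
        elif not in_string:
            if char == "{":
                depth += 1
                deepest = max(deepest, depth)
            elif char == "}":
                depth -= 1
        previous = char
    return deepest


def is_too_deep(query, limit=MAX_DEPTH):
    """Return True if the query nests deeper than the allowed limit."""
    return query_depth(query) > limit


shallow = "query { movies { title } }"
nested = "{ movies { director { movies { director { movies { title } } } } } }"
print(query_depth(shallow), is_too_deep(shallow))
print(query_depth(nested), is_too_deep(nested))
```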

To prevent CSRF attacks over GraphQL, the approach is similar to CSRF prevention in general:

Make sure your API only accepts queries over JSON-encoded POST requests.
Make sure the API validates that the request body matches the declared content type.
Use a secure CSRF token mechanism.
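
The first two checks in that list might look something like the sketch below, written as a framework-independent function. The name accept_graphql_request and its return convention are assumptions for illustration; in practice this logic would live in your web framework's middleware, alongside a CSRF token check that is not shown here.

```python
import json


def accept_graphql_request(method, content_type, body):
    """Return the parsed operation, or None if the request should be rejected."""
    if method.upper() != "POST":
        return None
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return None
    try:
        parsed = json.loads(body)
    except ValueError:
        return None  # declared as JSON, but the body is something else
    if not isinstance(parsed, dict) or not isinstance(parsed.get("query"), str):
        return None
    return parsed


good = accept_graphql_request(
    "POST", "application/json; charset=utf-8", '{"query": "{ __typename }"}'
)
forged = accept_graphql_request(
    "POST", "application/x-www-form-urlencoded", "query=%7B__typename%7D"
)
print(good)
print(forged)
```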
