Replace GraphQL extractor for BulkImports::Projects::Pipelines::RepositoryPipeline with file-based extractor
Situation
Currently, BulkImport uses GraphQL, the REST API, and exported files to extract data from the source location. For example, to get the attributes of a group, a GraphQL call is made to the source location requesting the attribute values that need to be migrated for the group. To extract the boards of a group, on the other hand, an exported file containing the relevant board data is used.
Problem
Recently, we noticed that using GraphQL or REST isn't optimal, as it increases the maintenance burden and complexity for several reasons:
- BulkImport needs to keep track of breaking changes in the GraphQL API to support backward compatibility, and the target instance needs to consider the GitLab version running on the source instance to build the GraphQL queries.
- BulkImport needs to consider the GitLab Edition of the source instance to build the GraphQL queries.
- APIs don't return the data in an optimized format to be imported, which results in more network requests or more transformation of the data on the target side.
- The APIs have to expose more information than migration alone requires.
Another problem with using network requests to extract data is that BulkImport can't migrate data from source instances the target instance can't reach over the network (air-gapped environments), because pipelines using GraphQL and REST are unable to communicate with the source location.
Solution
Because of the identified problems, it's desirable to refactor the pipelines that extract data via network requests to use exported files instead.
This issue is for replacing the GraphQL extractor in BulkImports::Projects::Pipelines::RepositoryPipeline with a file-based extractor.
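As an illustration of the direction, a file-based extractor reads records from an export file rather than issuing network requests. The sketch below is hypothetical (the class name `NdjsonFileExtractor` and its interface are not from the codebase); it assumes an NDJSON export format, one JSON object per line, similar to what GitLab project exports use:

```ruby
require 'json'
require 'tempfile'

# Hypothetical sketch: instead of querying the source instance over
# GraphQL/REST, read records from an exported NDJSON file.
# Records arrive in an import-ready shape, so no network round-trips
# or extra transformation are needed on the target side.
class NdjsonFileExtractor
  def initialize(path)
    @path = path
  end

  # Yields each exported record as a Hash.
  def each_record
    File.foreach(@path) do |line|
      line = line.strip
      next if line.empty?

      yield JSON.parse(line)
    end
  end
end

# Usage with a fake export file standing in for a real export archive:
file = Tempfile.new('export.ndjson')
file.puts('{"name":"repo-a","default_branch":"main"}')
file.puts('{"name":"repo-b","default_branch":"master"}')
file.flush

records = []
NdjsonFileExtractor.new(file.path).each_record { |r| records << r }
puts records.map { |r| r['name'] }.join(',')
```

Because the extractor only needs the export file, the same pipeline works in air-gapped setups where the export is transferred out of band.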