One of the unique capabilities of IVAAP is that it works with the cloud infrastructure of multiple vendors. Whether your SEG-Y file is stored on Microsoft Azure Blob Storage, Amazon S3 or Google Cloud Storage, IVAAP can visualize it.
Vendor-specific details only need to be entered when administrators register new connectors. For all other users, the user interface is identical regardless of the data source. The REST API consumed by IVAAP’s HTML5 client is common to all connectors as well. The key component that does the hard work of “speaking the language of each cloud vendor and hiding their details from the other components” is the IVAAP Data Backend.
While the concept of “storage in the cloud” is similar across all three vendors, each provides a different API to achieve similar goals. In this article, we will compare how to implement four basic operations. Because the IVAAP Data Backend is written in Java, we’ll only compare Java APIs.
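To make the idea of hiding vendor details concrete, here is a minimal sketch of what a vendor-neutral contract could look like. The names `ObjectStore` and `InMemoryObjectStore` are hypothetical and are not taken from IVAAP; the in-memory class only stands in for a vendor-specific implementation. The sketch covers the three single-object operations compared below and leaves listing out to stay short.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical vendor-neutral contract; not IVAAP's actual interface.
interface ObjectStore {
    boolean exists(String path) throws IOException;
    Instant lastModified(String path) throws IOException;
    InputStream open(String path) throws IOException;
}

// In-memory stand-in for a vendor-specific implementation (S3, Azure, GCS).
final class InMemoryObjectStore implements ObjectStore {
    private final Map<String, byte[]> contents = new HashMap<>();
    private final Map<String, Instant> modified = new HashMap<>();

    // Test helper: registers an object with an explicit modification time.
    void put(String path, byte[] data, Instant lastModified) {
        contents.put(path, data.clone());
        modified.put(path, lastModified);
    }

    @Override
    public boolean exists(String path) {
        return contents.containsKey(path);
    }

    @Override
    public Instant lastModified(String path) throws IOException {
        requirePresent(path);
        return modified.get(path);
    }

    @Override
    public InputStream open(String path) throws IOException {
        requirePresent(path);
        return new ByteArrayInputStream(contents.get(path));
    }

    // Missing objects are reported the same way regardless of the vendor.
    private void requirePresent(String path) throws IOException {
        if (!contents.containsKey(path)) {
            throw new FileNotFoundException(path);
        }
    }
}
```

Code written against `ObjectStore` never sees a vendor type, which is the property the rest of the stack relies on.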
Checking that an Object or Blob Exists
Amazon S3
String awsAccessKey = …;
String awsSecretKey = …;
String region = …;
String bucketName = …;
String keyName = …;
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
// The client builder expects a credentials provider and a Region object.
S3Client s3Client = S3Client.builder()
    .credentialsProvider(StaticCredentialsProvider.create(credentials))
    .region(Region.of(region))
    .build();
try {
    HeadObjectRequest request = HeadObjectRequest.builder()
        .bucket(bucketName)
        .key(keyName)
        .build();
    // HEAD fetches metadata only; a missing key surfaces as NoSuchKeyException.
    s3Client.headObject(request);
    return true;
} catch (NoSuchKeyException e) {
    return false;
}
Microsoft Azure Blob Storage
String accountName = …;
String accountKey = …;
String containerName = …;
String blobName = …;
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
    .endpoint(endpoint)
    .credential(credential);
BlobServiceClient client = builder.buildClient();
// Navigate from the account to the container, then to the blob.
BlobContainerClient containerClient = client.getBlobContainerClient(containerName);
BlobClient blobClient = containerClient.getBlobClient(blobName);
return blobClient.exists();
Google Cloud Storage
String authKey = …;
String projectId = …;
String bucketName = …;
String blobName = …;
// Parse the JSON service account key, then hand it to the credentials loader.
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder()
    .setCredentials(credentials)
    .setProjectId(projectId)
    .build()
    .getService();
// Request only the ID field; storage.get returns null when the blob is missing.
Blob blob = storage.get(bucketName, blobName, BlobGetOption.fields(BlobField.ID));
return blob != null && blob.exists();
Getting the Last Modification Date of an Object or Blob
Amazon S3
String awsAccessKey = …;
String awsSecretKey = …;
String region = …;
String bucketName = …;
String keyName = …;
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
    .credentialsProvider(StaticCredentialsProvider.create(credentials))
    .region(Region.of(region))
    .build();
HeadObjectRequest headObjectRequest = HeadObjectRequest.builder()
    .bucket(bucketName)
    .key(keyName)
    .build();
// The HEAD response carries the object's metadata, including its timestamp.
HeadObjectResponse headObjectResponse = s3Client.headObject(headObjectRequest);
return headObjectResponse.lastModified();
Microsoft Azure Blob Storage
String accountName = …;
String accountKey = …;
String containerName = …;
String blobName = …;
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
    .endpoint(endpoint)
    .credential(credential);
BlobServiceClient client = builder.buildClient();
// The blob client is obtained through its container client.
BlobClient blob = client.getBlobContainerClient(containerName).getBlobClient(blobName);
BlobProperties properties = blob.getProperties();
return properties.getLastModified();
Google Cloud Storage
String authKey = …;
String projectId = …;
String bucketName = …;
String blobName = …;
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder()
    .setCredentials(credentials)
    .setProjectId(projectId)
    .build()
    .getService();
// Request only the UPDATED field to keep the metadata call small.
Blob blob = storage.get(bucketName, blobName, BlobGetOption.fields(Storage.BlobField.UPDATED));
return blob.getUpdateTime();
Getting an Input Stream out of an Object or Blob
Amazon S3
String awsAccessKey = …;
String awsSecretKey = …;
String region = …;
String bucketName = …;
String keyName = …;
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
    .credentialsProvider(StaticCredentialsProvider.create(credentials))
    .region(Region.of(region))
    .build();
GetObjectRequest getObjectRequest = GetObjectRequest.builder()
    .bucket(bucketName)
    .key(keyName)
    .build();
// getObject returns a ResponseInputStream, which is a regular InputStream.
return s3Client.getObject(getObjectRequest);
Microsoft Azure Blob Storage
String accountName = …;
String accountKey = …;
String containerName = …;
String blobName = …;
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
    .endpoint(endpoint)
    .credential(credential);
BlobServiceClient client = builder.buildClient();
// The blob client is obtained through its container client.
BlobClient blob = client.getBlobContainerClient(containerName).getBlobClient(blobName);
return blob.openInputStream();
Google Cloud Storage
String authKey = …;
String projectId = …;
String bucketName = …;
String blobName = …;
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder()
    .setCredentials(credentials)
    .setProjectId(projectId)
    .build()
    .getService();
Blob blob = storage.get(bucketName, blobName, BlobGetOption.fields(BlobField.values()));
// The blob exposes an NIO channel; wrap it to obtain a classic InputStream.
return Channels.newInputStream(blob.reader());
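All three snippets end up returning a plain `java.io.InputStream`, so the consuming code can be identical for every vendor. The sketch below is an illustration of that point rather than IVAAP code: the hypothetical helper `StreamPrefixReader.readPrefix` reads the first bytes of any stream (for instance the 3200-byte textual header at the start of a SEG-Y file) and closes the stream afterwards, which matters because each open stream typically holds a network connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper; works with a stream from any of the three vendors.
final class StreamPrefixReader {
    // Reads up to 'length' bytes from the start of 'source', then closes it.
    static byte[] readPrefix(InputStream source, int length) throws IOException {
        try (InputStream in = source) {
            byte[] buffer = new byte[length];
            int total = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (total < length) {
                int read = in.read(buffer, total, length - total);
                if (read < 0) {
                    break; // End of stream reached before 'length' bytes.
                }
                total += read;
            }
            return Arrays.copyOf(buffer, total);
        }
    }
}
```

The loop is needed because network-backed streams often deliver data in chunks smaller than the requested size.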
Listing the Objects in a Bucket or Container While Taking into Account Folder Hierarchies
Amazon S3
String awsAccessKey = …;
String awsSecretKey = …;
String region = …;
String bucketName = …;
String parentFolderPath = …;
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
    .credentialsProvider(StaticCredentialsProvider.create(credentials))
    .region(Region.of(region))
    .build();
// The "/" delimiter groups deeper keys into common prefixes, which act as folders.
ListObjectsV2Request.Builder builder = ListObjectsV2Request.builder()
    .bucket(bucketName)
    .delimiter("/")
    .prefix(parentFolderPath + "/");
ListObjectsV2Request request = builder.build();
ListObjectsV2Iterable paginator = s3Client.listObjectsV2Paginator(request);
Iterator<CommonPrefix> foldersIterator = paginator.commonPrefixes().iterator();
while (foldersIterator.hasNext()) {
    …
}
Microsoft Azure Blob Storage
String accountName = …;
String accountKey = …;
String containerName = …;
String parentFolderPath = …;
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
    .endpoint(endpoint)
    .credential(credential);
BlobServiceClient client = builder.buildClient();
BlobContainerClient containerClient = client.getBlobContainerClient(containerName);
// Lists only the direct children of the given virtual directory.
Iterable<BlobItem> iterable = containerClient.listBlobsByHierarchy(parentFolderPath + "/");
for (BlobItem currentItem : iterable) {
    …
}
Google Cloud Storage
String authKey = …;
String projectId = …;
String bucketName = …;
String parentFolderPath = …;
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder()
    .setCredentials(credentials)
    .setProjectId(projectId)
    .build()
    .getService();
// currentDirectory() restricts the listing to direct children of the prefix.
Page<Blob> blobs = storage.list(bucketName,
    BlobListOption.prefix(parentFolderPath + "/"),
    BlobListOption.currentDirectory());
for (Blob currentBlob : blobs.iterateAll()) {
    …
}
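All three listings rely on the same idea: object keys live in a flat namespace, and "folders" are inferred from a prefix and a "/" delimiter. The sketch below emulates that grouping in memory to show what the services compute. `HierarchyListing` is a hypothetical class written for this illustration and does not call any vendor SDK; skipping a zero-byte folder placeholder key is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical in-memory emulation of a prefix + delimiter listing.
final class HierarchyListing {
    final SortedSet<String> folders = new TreeSet<>(); // Akin to S3 common prefixes.
    final List<String> objects = new ArrayList<>();    // Direct children only.

    static HierarchyListing list(Iterable<String> keys, String parentFolderPath) {
        String prefix = parentFolderPath + "/";
        HierarchyListing result = new HierarchyListing();
        for (String key : keys) {
            // Ignore keys outside the folder, and the folder placeholder itself.
            if (!key.startsWith(prefix) || key.equals(prefix)) {
                continue;
            }
            int slash = key.indexOf('/', prefix.length());
            if (slash < 0) {
                result.objects.add(key); // No further delimiter: a direct child.
            } else {
                // Deeper keys collapse into one entry per immediate subfolder.
                result.folders.add(key.substring(0, slash + 1));
            }
        }
        return result;
    }
}
```

For example, listing `data` over the keys `data/a.segy`, `data/sub/b.segy` and `data/sub/c.segy` yields one object (`data/a.segy`) and one folder (`data/sub/`), no matter how many keys sit below that subfolder.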
Most developers will discover these APIs through their favorite search engine. Driven by innovation and performance, cloud APIs become obsolete quickly. Amazon was the pioneer, and much of the documentation still indexed by Google covers the v1 SDK, even though v2 has been available for more than two years, although not as a complete replacement. This sometimes makes research challenging for even the simplest needs. Microsoft migrated from v8 to v12 more recently and faces a similar challenge. Because Google is the most recent major player, its SDK is not dragged down much by obsolete articles.
The second way that developers will discover an API is by using the official documentation. I found that the Microsoft documentation is the most accessible. There is a definite feel that the Microsoft Azure documentation is treated as an important part of the product, with lots of high-quality sample code targeted at beginners.
The third way that developers discover an API is by using their IDE’s code completion. All cloud vendors make heavy use of the builder pattern. The builder pattern is a powerful way to provide options without breaking backward compatibility, but it slows down self-discovery of the API. The Amazon S3 API also stays quite close to the HTTP protocol, using terminology such as “GetObjectRequest” and “HeadObjectRequest”. Microsoft had a higher-level API in v8, where you manipulated blobs directly. The v12 iteration moved away from this apparent simplicity by introducing the concept of blob clients instead. Microsoft offers a refreshing explanation of this transition. Overall, I found that the Google SDK tends to offer simpler APIs for performing simple tasks.
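The backward-compatibility argument for builders can be illustrated with a small sketch. `ListRequest` below is a hypothetical request type, not a class from any of the three SDKs: every option has a default, so a later version can add another option (here `pageSize`) without changing any existing call site.

```java
// Hypothetical request type illustrating the builder pattern.
final class ListRequest {
    private final String bucket;
    private final String prefix;
    private final String delimiter;
    private final int pageSize;

    private ListRequest(Builder builder) {
        this.bucket = builder.bucket;
        this.prefix = builder.prefix;
        this.delimiter = builder.delimiter;
        this.pageSize = builder.pageSize;
    }

    static Builder builder() {
        return new Builder();
    }

    String bucket() { return bucket; }
    String prefix() { return prefix; }
    String delimiter() { return delimiter; }
    int pageSize() { return pageSize; }

    static final class Builder {
        private String bucket;          // Required, no sensible default.
        private String prefix = "";     // Optional.
        private String delimiter = "/"; // Optional.
        private int pageSize = 1000;    // Option added later; old callers unaffected.

        Builder bucket(String value) { this.bucket = value; return this; }
        Builder prefix(String value) { this.prefix = value; return this; }
        Builder delimiter(String value) { this.delimiter = value; return this; }
        Builder pageSize(int value) { this.pageSize = value; return this; }

        ListRequest build() {
            if (bucket == null) {
                throw new IllegalStateException("bucket is required");
            }
            return new ListRequest(this);
        }
    }
}
```

A call such as `ListRequest.builder().bucket("seismic").prefix("data/").build()` keeps compiling when new options appear. The cost is the one noted above: code completion on `ListRequest` itself reveals little until you find `builder()`.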
There are more criteria than simplicity and discoverability when comparing APIs. Versatility and performance are two of them. The Amazon S3 Java SDK is probably the most versatile because of the larger number of applications that have used its technology. It even works with S3 clones such as MinIO Object Storage (and so does IVAAP). The space where there are still a lot of changes is asynchronous APIs. Asynchronous APIs tend to offer higher scalability and faster execution, but can only be compared in specific use cases where they are actually needed. IVAAP makes heavy use of asynchronous APIs, especially to visualize seismic data. This area evolves rapidly and deserves a more in-depth comparison, which would be the subject of another article.
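As a rough illustration of why asynchronous calls can help, the sketch below fetches several items concurrently with the JDK's `CompletableFuture` instead of one after another. It is not IVAAP code and does not use any vendor's async client; `ConcurrentFetch` and its `blockingFetch` parameter are hypothetical stand-ins for a call that reads one object or byte range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical helper that runs one fetch per key concurrently.
final class ConcurrentFetch {
    static <K, V> CompletableFuture<List<V>> fetchAll(
            List<K> keys, Function<K, V> blockingFetch, Executor executor) {
        // Start every fetch immediately; none waits for the previous one.
        List<CompletableFuture<V>> pending = new ArrayList<>();
        for (K key : keys) {
            pending.add(CompletableFuture.supplyAsync(() -> blockingFetch.apply(key), executor));
        }
        // Complete once all fetches are done, preserving the order of 'keys'.
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]))
            .thenApply(ignored -> {
                List<V> results = new ArrayList<>();
                for (CompletableFuture<V> future : pending) {
                    results.add(future.join()); // Already complete, does not block.
                }
                return results;
            });
    }
}
```

With a network-bound fetch, the total latency approaches that of the slowest request rather than the sum of all requests, provided the executor has enough threads.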
For more information on IVAAP, please visit int.flywheelstaging.com/products/ivaap/