Hedvig object storage: Workflow and usage

By Shufan (Lucy) Ge | Object Storage

Object storage is the preferred storage protocol when it comes to storing static and unstructured data. It is also synonymous with cloud storage, as most cloud storage providers leverage object-storage architecture. Object storage is suitable for multiple use cases, such as backup and archiving, big data analytics, and document storage solutions. The minimum set of expectations that you should have from any object storage provider includes low-cost storage, scalability, cloud integration, and support for the widely used S3 protocol.

Hedvig Object Storage speaks two of the most common ‘languages’ in object storage protocols, S3 and OpenStack Swift. Hedvig provides a native object storage implementation and does not rely on any existing distributed file system or storage protocol. Hedvig Object Storage also provides the enterprise storage features that come standard with its block and NFS implementations, such as compression, deduplication, and encryption.

Here are a few of the capabilities provided by Hedvig Object Storage:

  • reading (parts and whole), writing, and listing of object data, metadata, and object versions in a bucket
  • deletion of multiple objects in one request, reducing per request overhead
  • hierarchical object namespaces (pseudo folder structures, similar to that of file systems)
  • object versioning
  • access over HTTP and HTTPS
  • large objects through multipart uploads
  • S3 version 2 and version 4 authentication
  • ACLs (access control lists) at bucket and object granularity
  • bucket lifecycle rules, which enable automatic expiration of objects, based on predetermined rules associated with a bucket

OBJECT STORAGE WORKFLOW

The Hedvig Storage Proxy (HSP) translates the S3 and Swift protocols and communicates with the Hedvig storage cluster for all object storage operations. The HSP is a flexible component that can be deployed in multiple ways to accommodate user application requirements: as a VM, as a container, or on a standalone physical server. The HSP can act as an AWS S3 server or an OpenStack Swift server. You can deploy one or more HSPs:

  1. Single Hedvig Storage Proxy: There is only one HSP, and all of the S3 clients talk to the Hedvig storage cluster using this HSP.
  2. Multiple Hedvig Storage Proxies: Multiple HSPs are deployed, and they are fronted by a load balancer (for example, nginx). All S3 clients talk to this load balancer, and it takes care of distributing the workload across these HSPs (Figure 1).

Figure 1: Hedvig Object Storage workflow

There are multiple ways of interacting with Hedvig Object Storage: AWS S3 CLI, S3 browser, Cyberduck, AWS SDK, etc.

Here is how you can utilize the AWS S3 CLI:
aws s3api --endpoint-url <Hedvig_endpoint> --profile <profile_name or default> <rest of the s3api command>

The <Hedvig_endpoint> for object storage is identified by:
http://<hostname>:<port_number>
or
https://<hostname>:<port_number>

<hostname> – the hostname of the Hedvig Storage Proxy that you have configured for object storage or the hostname of the load balancer with which you may be fronting the storage proxies.

<port_number> – the port number specified in Hedvig’s configuration file or the port on which your HTTP load balancer is running in your environment.

Note: The profile parameter corresponds to the user profiles configured in the AWS credentials file.
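
As a minimal sketch of the syntax above, the wrapper below sets the endpoint and profile once, so each call only needs the s3api subcommand and its arguments. The function name `hs3`, the variables `HEDVIG_ENDPOINT`, `HEDVIG_PROFILE`, and `DRY_RUN`, the host `hsp.example.com:8080`, and the bucket and key names are all assumptions made for illustration; none of them come from Hedvig or the AWS CLI. The subcommands shown are standard AWS CLI s3api calls that correspond to some of the capabilities listed earlier (versions, partial reads, multi-object delete). How a given Hedvig release responds to each should be checked in your environment. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical endpoint and profile; replace with your HSP or load balancer.
HEDVIG_ENDPOINT="${HEDVIG_ENDPOINT:-http://hsp.example.com:8080}"
HEDVIG_PROFILE="${HEDVIG_PROFILE:-default}"
# Safe default: print commands instead of running them. Set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# hs3 <s3api subcommand> [args...]
# Runs (or, in dry-run mode, prints) an s3api call against the Hedvig endpoint.
hs3() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "aws s3api $* --endpoint-url $HEDVIG_ENDPOINT --profile $HEDVIG_PROFILE"
    else
        aws s3api "$@" --endpoint-url "$HEDVIG_ENDPOINT" --profile "$HEDVIG_PROFILE"
    fi
}

# Example calls (bucket and key names are placeholders):
hs3 list-buckets
hs3 list-object-versions --bucket mybucket
# Read only the first 1024 bytes of an object into part.bin.
hs3 get-object --bucket mybucket --key report.pdf --range bytes=0-1023 part.bin
# Delete two objects in a single request.
hs3 delete-objects --bucket mybucket --delete 'Objects=[{Key=a.txt},{Key=b.txt}]'
```

Keeping the endpoint in one variable makes it easy to switch between a single HSP and a load balancer that fronts several of them.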

SPECIFYING HEDVIG ATTRIBUTES WHEN CREATING A BUCKET

Hedvig-specific attributes can be applied during bucket creation as:

aws s3api create-bucket --endpoint-url <Hedvig_endpoint> --bucket <bucketName> --create-bucket-configuration "LocationConstraint=<blkSize:replicationFactor:replicationPolicy:policyDetails:versioningStatus:bucketSize>"

  • blkSize: 4 / 64 (in kB)
  • replicationFactor: 1 to 6
  • replicationPolicy: Agnostic / DataCenterAware / RackAware
  • versioningStatus: unversioned / versioningenabled
  • bucketSize: <numeric value> (in GB)


For example, the following command creates a versioned bucket with a DataCenterAware replication policy (with snc1, snc2, and snc3 as the policy details), a replication factor of 3, a block size of 4 kB, and a bucket size of 1 TB (1024 GB):

aws s3api create-bucket --endpoint-url <Hedvig_endpoint> --bucket <bucketName> --create-bucket-configuration "LocationConstraint=4:3:DataCenterAware:snc1\,snc2\,snc3:versioningenabled:1024"
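
Because the LocationConstraint value packs six fields into one colon-separated string, it is easy to mistype. The helper below is a sketch that checks each field against the values listed above and assembles the string. The function name `build_location_constraint` and its argument order are assumptions for illustration and are not part of Hedvig or the AWS CLI. The policyDetails field is passed through as given, apart from escaping commas the same way the example above does; this post only shows its form for DataCenterAware.

```shell
#!/bin/sh
# build_location_constraint blkSize replicationFactor replicationPolicy \
#                           policyDetails versioningStatus bucketSize
# Prints the colon-separated value on success; returns non-zero on bad input.
# Note: uses plain (global) shell variables to stay POSIX-compatible.
build_location_constraint() {
    if [ "$#" -ne 6 ]; then
        echo "usage: build_location_constraint blkSize rf policy details versioning sizeGB" >&2
        return 2
    fi
    blk=$1; rf=$2; policy=$3; details=$4; versioning=$5; size=$6

    case "$blk" in
        4|64) ;;
        *) echo "blkSize must be 4 or 64 (kB)" >&2; return 1 ;;
    esac
    case "$rf" in
        [1-6]) ;;
        *) echo "replicationFactor must be 1 to 6" >&2; return 1 ;;
    esac
    case "$policy" in
        Agnostic|DataCenterAware|RackAware) ;;
        *) echo "unknown replicationPolicy: $policy" >&2; return 1 ;;
    esac
    case "$versioning" in
        unversioned|versioningenabled) ;;
        *) echo "unknown versioningStatus: $versioning" >&2; return 1 ;;
    esac
    case "$size" in
        ''|*[!0-9]*) echo "bucketSize must be a whole number (GB)" >&2; return 1 ;;
    esac

    # Escape commas so the list stays inside a single field, as in the example.
    escaped=$(printf '%s' "$details" | sed 's/,/\\,/g')
    printf '%s:%s:%s:%s:%s:%s\n' "$blk" "$rf" "$policy" "$escaped" "$versioning" "$size"
}

# Rebuild the value used in the example command above.
constraint=$(build_location_constraint 4 3 DataCenterAware snc1,snc2,snc3 versioningenabled 1024)
printf 'LocationConstraint=%s\n' "$constraint"
```

The printed value can then be passed in double quotes to --create-bucket-configuration in the create-bucket command.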


OTHER WAYS TO ACCESS HEDVIG OBJECT STORAGE

  1. Cyberduck (GUI for Mac)
  2. S3 Browser (GUI for Windows)
  3. Programmable API development kit, for example using the Java SDK to fetch an S3 client that connects to the HSP S3 service

The Java SDK can be referenced here.

Hedvig’s native ability to deploy its distributed storage platform across multiple data centers and clouds lets your object storage infrastructure span geographical boundaries. You can store objects from anywhere and retrieve them from anywhere, which enables private cloud, hybrid-cloud, and multi-cloud deployments. You can read more about different cloud deployments using Hedvig Object Storage here.
