Go CDK is short for the Go Cloud Development Kit (formerly known as Go Cloud), a project that provides a unified API for services with roughly equivalent functionality across the major cloud vendors.
For example, saving and retrieving an object in a cloud storage service can be written as follows with Go CDK [^1]. The three programs below target S3, GCS, and Azure Blob Storage, respectively.
[^1]: The sample code in this article uses gocloud.dev v0.20.0.
package main
import (
"context"
"fmt"
"log"
"gocloud.dev/blob"
_ "gocloud.dev/blob/s3blob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "s3://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
//Save
if err := bucket.WriteAll(ctx, "sample.txt", []byte("Hello, world!"), nil); err != nil {
log.Fatal(err)
}
//Get
data, err := bucket.ReadAll(ctx, "sample.txt")
if err != nil {
log.Fatal(err)
}
fmt.Println(string(data))
}
package main
import (
"context"
"fmt"
"log"
"gocloud.dev/blob"
_ "gocloud.dev/blob/gcsblob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "gs://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
//Save
if err := bucket.WriteAll(ctx, "sample.txt", []byte("Hello, world!"), nil); err != nil {
log.Fatal(err)
}
//Get
data, err := bucket.ReadAll(ctx, "sample.txt")
if err != nil {
log.Fatal(err)
}
fmt.Println(string(data))
}
package main
import (
"context"
"fmt"
"log"
"gocloud.dev/blob"
_ "gocloud.dev/blob/azureblob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "azblob://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
//Save
if err := bucket.WriteAll(ctx, "sample.txt", []byte("Hello, world!"), nil); err != nil {
log.Fatal(err)
}
//Get
data, err := bucket.ReadAll(ctx, "sample.txt")
if err != nil {
log.Fatal(err)
}
fmt.Println(string(data))
}
The only differences in the code between cloud vendors are the driver import and the scheme of the URL passed to `blob.OpenBucket()`. It is wonderful!
In this way, Go CDK makes it easy to implement multi-cloud applications and highly cloud-portable applications.
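Because the driver is chosen by the URL scheme, the same code can even switch vendors purely through configuration. Below is a minimal sketch (the environment variable name `BUCKET_URL` is made up for illustration) that registers several drivers and lets the URL decide which one is actually used.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"gocloud.dev/blob"
	// Register the drivers; the URL scheme decides which one is used at runtime.
	_ "gocloud.dev/blob/azureblob"
	_ "gocloud.dev/blob/gcsblob"
	_ "gocloud.dev/blob/s3blob"
)

func main() {
	ctx := context.Background()

	// e.g. BUCKET_URL=s3://bucket, gs://bucket or azblob://bucket
	url := os.Getenv("BUCKET_URL")

	bucket, err := blob.OpenBucket(ctx, url)
	if err != nil {
		log.Fatal(err)
	}
	defer bucket.Close()

	// Save
	if err := bucket.WriteAll(ctx, "sample.txt", []byte("Hello, world!"), nil); err != nil {
		log.Fatal(err)
	}
	// Get
	data, err := bucket.ReadAll(ctx, "sample.txt")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(data))
}
```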
If you would like to know more about Go CDK, please refer to the official information.
Go CDK is very convenient, but as of October 2020 the project status appears to be "API is alpha but production-ready" [^2]. Adopt it at your own risk.
Now, on to the main topic.
I'm sure some people think, "I only use AWS! Vendor lock-in is fine!" In this article, I would like to show the advantages of using Go CDK even for such people, using S3 object operations (save / retrieve) as the example.
The Go CDK API is designed to be intuitive, easy to understand, and easy to handle.
Reading and writing objects in cloud storage with Go CDK is done through [`blob.Bucket`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob#Bucket)'s [`NewReader()`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob#Bucket.NewReader) and [`NewWriter()`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob#Bucket.NewWriter), which return a [`blob.Reader`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob#Reader) (implementing [`io.Reader`](https://pkg.go.dev/io#Reader)) and a [`blob.Writer`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob#Writer) (implementing [`io.Writer`](https://pkg.go.dev/io#Writer)).
It is very intuitive: you retrieve (read) an object through `blob.Reader` (an `io.Reader`) and save (write) one through `blob.Writer` (an `io.Writer`). This lets you work with objects in the cloud much as you would work with local files.
Let's look at concrete examples of how this is easier to follow than using the AWS SDK.
With the AWS SDK, you use [`Upload()`](https://pkg.go.dev/github.com/aws/aws-sdk-go/service/s3/s3manager#Uploader.Upload) of [`s3manager.Uploader`](https://pkg.go.dev/github.com/aws/aws-sdk-go/service/s3/s3manager#Uploader). The contents of the object to upload are passed to the method as an `io.Reader`. When uploading a local file it is convenient to pass an `os.File` as is, but it gets awkward when you want to take data in memory, encode it in some format, and save it directly.
For example, JSON-encoding data and uploading it straight to S3 looks like this with the AWS SDK.
package main
import (
"encoding/json"
"io"
"log"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/s3/s3manager"
)
func main() {
sess, err := session.NewSession(&aws.Config{
Region: aws.String("ap-northeast-1"),
})
if err != nil {
log.Fatal(err)
}
uploader := s3manager.NewUploader(sess)
data := struct {
Key1 string
Key2 string
}{
Key1: "value1",
Key2: "value2",
}
pr, pw := io.Pipe()
go func() {
err := json.NewEncoder(pw).Encode(data)
pw.CloseWithError(err)
}()
in := &s3manager.UploadInput{
Bucket: aws.String("bucket"),
Key: aws.String("sample.json"),
Body: pr,
}
if _, err := uploader.Upload(in); err != nil {
log.Fatal(err)
}
}
You have to use [`io.Pipe()`](https://pkg.go.dev/io#Pipe) to connect the `io.Writer` used for JSON encoding to the `io.Reader` passed to `s3manager.UploadInput`.
With Go CDK, writing is done through `blob.Writer` (an `io.Writer`), so you simply pass it to [`json.NewEncoder()`](https://pkg.go.dev/encoding/json#NewEncoder).
package main
import (
"context"
"encoding/json"
"log"
"gocloud.dev/blob"
_ "gocloud.dev/blob/s3blob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "s3://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
data := struct {
Key1 string
Key2 string
}{
Key1: "value1",
Key2: "value2",
}
w, err := bucket.NewWriter(ctx, "sample.json", nil)
if err != nil {
log.Fatal(err)
}
defer w.Close()
if err := json.NewEncoder(w).Encode(data); err != nil {
log.Fatal(err)
}
}
Of course, uploading a local file is also simple: just use `io.Copy`, as if copying from one file to another.
package main
import (
"context"
"io"
"log"
"os"
"gocloud.dev/blob"
_ "gocloud.dev/blob/s3blob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "s3://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
file, err := os.Open("sample.txt")
if err != nil {
log.Fatal(err)
}
defer file.Close()
w, err := bucket.NewWriter(ctx, "sample.txt", nil)
if err != nil {
log.Fatal(err)
}
defer w.Close()
if _, err := io.Copy(w, file); err != nil {
log.Fatal(err)
}
}
Incidentally, the writer in `s3blob` is implemented as a wrapper around `s3manager.Uploader`, so you still benefit from `s3manager.Uploader`'s parallel upload capability.
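If you want to tune that upload, `blob.WriterOptions` has a `BufferSize` field that changes the size of the chunks the writer sends in a single request; how (and whether) a driver maps this onto its multipart upload is driver-specific, so treat the following as a minimal sketch with illustrative values rather than a guaranteed recipe.

```go
package main

import (
	"context"
	"io"
	"log"
	"os"

	"gocloud.dev/blob"
	_ "gocloud.dev/blob/s3blob"
)

func main() {
	ctx := context.Background()

	bucket, err := blob.OpenBucket(ctx, "s3://bucket")
	if err != nil {
		log.Fatal(err)
	}
	defer bucket.Close()

	file, err := os.Open("sample.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// BufferSize changes the chunk size the writer uploads in a single request;
	// the driver decides how this relates to its underlying multipart upload.
	opts := &blob.WriterOptions{
		ContentType: "text/plain",
		BufferSize:  16 * 1024 * 1024, // 16 MiB, illustrative
	}
	w, err := bucket.NewWriter(ctx, "sample.txt", opts)
	if err != nil {
		log.Fatal(err)
	}
	defer w.Close()

	if _, err := io.Copy(w, file); err != nil {
		log.Fatal(err)
	}
}
```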
Consider the case of getting JSON from S3 and decoding it.
With the AWS SDK, you use [`s3.GetObject()`](https://pkg.go.dev/github.com/aws/aws-sdk-go/service/s3#S3.GetObject). [`s3manager.Downloader`](https://pkg.go.dev/github.com/aws/aws-sdk-go/service/s3/s3manager#Downloader), the counterpart of `s3manager.Uploader`, requires the output destination to implement `io.WriterAt`, so it cannot be used in this case.
package main
import (
"encoding/json"
"fmt"
"log"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/s3"
)
func main() {
sess, err := session.NewSession(&aws.Config{
Region: aws.String("ap-northeast-1"),
})
if err != nil {
log.Fatal(err)
}
svc := s3.New(sess)
data := struct {
Key1 string
Key2 string
}{}
in := &s3.GetObjectInput{
Bucket: aws.String("bucket"),
Key: aws.String("sample.json"),
}
out, err := svc.GetObject(in)
if err != nil {
log.Fatal(err)
}
defer out.Body.Close()
if err := json.NewDecoder(out.Body).Decode(&data); err != nil {
log.Fatal(err)
}
fmt.Printf("%+v\n", data)
}
With Go CDK, you just write the upload in reverse.
package main
import (
"context"
"encoding/json"
"fmt"
"log"
"gocloud.dev/blob"
_ "gocloud.dev/blob/s3blob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "s3://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
r, err := bucket.NewReader(ctx, "sample.json", nil)
if err != nil {
log.Fatal(err)
}
defer r.Close()
data := struct {
Key1 string
Key2 string
}{}
if err := json.NewDecoder(r).Decode(&data); err != nil {
log.Fatal(err)
}
fmt.Printf("%+v\n", data)
}
To write the retrieved object to a local file with the AWS SDK, you can use `s3manager.Downloader`.
package main
import (
"log"
"os"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/s3"
"github.com/aws/aws-sdk-go/service/s3/s3manager"
)
func main() {
sess, err := session.NewSession(&aws.Config{
Region: aws.String("ap-northeast-1"),
})
if err != nil {
log.Fatal(err)
}
downloader := s3manager.NewDownloader(sess)
file, err := os.Create("sample.txt")
if err != nil {
log.Fatal(err)
}
defer file.Close()
in := &s3.GetObjectInput{
Bucket: aws.String("bucket"),
Key: aws.String("sample.txt"),
}
if _, err := downloader.Download(file, in); err != nil {
log.Fatal(err)
}
}
With Go CDK, it is again just the upload written in reverse.
package main
import (
"context"
"io"
"log"
"os"
"gocloud.dev/blob"
_ "gocloud.dev/blob/s3blob"
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "s3://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
file, err := os.Create("sample.txt")
if err != nil {
log.Fatal(err)
}
defer file.Close()
r, err := bucket.NewReader(ctx, "sample.txt", nil)
if err != nil {
log.Fatal(err)
}
defer r.Close()
if _, err := io.Copy(file, r); err != nil {
log.Fatal(err)
}
}
However, although this approach is simple, it has a drawback.
`s3manager.Downloader` demands an `io.WriterAt` as its output destination, but in exchange it offers parallel downloads with excellent performance; Go CDK, as is, cannot download in parallel. If you want parallel downloads with Go CDK, you have to implement them yourself using [`NewRangeReader()`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob#Bucket.NewRangeReader).
package main
import (
"context"
"errors"
"fmt"
"io"
"log"
"os"
"sync"
"gocloud.dev/blob"
_ "gocloud.dev/blob/s3blob"
)
const (
downloadPartSize = 1024 * 1024 * 5
downloadConcurrency = 5
)
func main() {
ctx := context.Background()
bucket, err := blob.OpenBucket(ctx, "s3://bucket")
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
file, err := os.Create("sample.txt")
if err != nil {
log.Fatal(err)
}
defer file.Close()
d := &downloader{
ctx: ctx,
bucket: bucket,
key: "sample.txt",
partSize: downloadPartSize,
concurrency: downloadConcurrency,
w: file,
}
if err := d.download(); err != nil {
log.Fatal(err)
}
}
type downloader struct {
ctx context.Context
bucket *blob.Bucket
key string
opts *blob.ReaderOptions
partSize int64
concurrency int
w io.WriterAt
wg sync.WaitGroup
sizeMu sync.RWMutex
errMu sync.RWMutex
pos int64
totalBytes int64
err error
partBodyMaxRetries int
}
func (d *downloader) download() error {
d.getChunk()
if err := d.getErr(); err != nil {
return err
}
total := d.getTotalBytes()
ch := make(chan chunk, d.concurrency)
for i := 0; i < d.concurrency; i++ {
d.wg.Add(1)
go d.downloadPart(ch)
}
for d.getErr() == nil {
if d.pos >= total {
break
}
ch <- chunk{w: d.w, start: d.pos, size: d.partSize}
d.pos += d.partSize
}
close(ch)
d.wg.Wait()
return d.getErr()
}
func (d *downloader) downloadPart(ch chan chunk) {
defer d.wg.Done()
for {
c, ok := <-ch
if !ok {
break
}
if d.getErr() != nil {
continue
}
if err := d.downloadChunk(c); err != nil {
d.setErr(err)
}
}
}
func (d *downloader) getChunk() {
if d.getErr() != nil {
return
}
c := chunk{w: d.w, start: d.pos, size: d.partSize}
d.pos += d.partSize
if err := d.downloadChunk(c); err != nil {
d.setErr(err)
}
}
func (d *downloader) downloadChunk(c chunk) error {
var err error
for retry := 0; retry <= d.partBodyMaxRetries; retry++ {
err = d.tryDownloadChunk(c)
if err == nil {
break
}
bodyErr := &errReadingBody{}
if !errors.As(err, &bodyErr) {
return err
}
c.cur = 0
}
return err
}
func (d *downloader) tryDownloadChunk(c chunk) error {
r, err := d.bucket.NewRangeReader(d.ctx, d.key, c.start, c.size, d.opts)
if err != nil {
return err
}
defer r.Close()
if _, err := io.Copy(&c, r); err != nil {
return err
}
d.setTotalBytes(r.Size())
return nil
}
func (d *downloader) getErr() error {
d.errMu.RLock()
defer d.errMu.RUnlock()
return d.err
}
func (d *downloader) setErr(err error) {
d.errMu.Lock()
defer d.errMu.Unlock()
d.err = err
}
func (d *downloader) getTotalBytes() int64 {
d.sizeMu.RLock()
defer d.sizeMu.RUnlock()
return d.totalBytes
}
func (d *downloader) setTotalBytes(size int64) {
d.sizeMu.Lock()
defer d.sizeMu.Unlock()
d.totalBytes = size
}
type chunk struct {
w io.WriterAt
start int64
size int64
cur int64
}
func (c *chunk) Write(p []byte) (int, error) {
if c.cur >= c.size {
return 0, io.EOF
}
n, err := c.w.WriteAt(p, c.start+c.cur)
c.cur += int64(n)
return n, err
}
type errReadingBody struct {
err error
}
func (e *errReadingBody) Error() string {
return fmt.Sprintf("failed to read part body: %v", e.err)
}
func (e *errReadingBody) Unwrap() error {
return e.err
}
This ends up being more or less a re-implementation of `s3manager.Downloader`.
Go CDK is developed with the goal of providing a local implementation for every service, so you can easily swap the cloud service for a local implementation. For example, on a local development server it is convenient to replace all the services with their local implementations so that the application runs without accessing AWS or GCP.
For the [`gocloud.dev/blob`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob) package that handles cloud storage, [`fileblob`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob/fileblob) is provided as an implementation that reads and writes local files.
The following example switches the output destination of the encoded JSON between S3 and a local file depending on a command-line option.
package main
import (
"context"
"encoding/json"
"flag"
"log"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"gocloud.dev/blob"
"gocloud.dev/blob/fileblob"
"gocloud.dev/blob/s3blob"
)
func main() {
var local bool
flag.BoolVar(&local, "local", false, "output to a local file")
flag.Parse()
ctx := context.Background()
bucket, err := openBucket(ctx, local)
if err != nil {
log.Fatal(err)
}
defer bucket.Close()
data := struct {
Key1 string
Key2 string
}{
Key1: "value1",
Key2: "value2",
}
w, err := bucket.NewWriter(ctx, "sample.json", nil)
if err != nil {
log.Fatal(err)
}
defer w.Close()
if err := json.NewEncoder(w).Encode(data); err != nil {
log.Fatal(err)
}
}
func openBucket(ctx context.Context, local bool) (*blob.Bucket, error) {
if local {
return openLocalBucket(ctx)
}
return openS3Bucket(ctx)
}
func openLocalBucket(ctx context.Context) (*blob.Bucket, error) {
return fileblob.OpenBucket("output", nil)
}
func openS3Bucket(ctx context.Context) (*blob.Bucket, error) {
sess, err := session.NewSession(&aws.Config{
Region: aws.String("ap-northeast-1"),
})
if err != nil {
return nil, err
}
return s3blob.OpenBucket(ctx, sess, "bucket", nil)
}
If you run it as is, `sample.json` is saved to S3, but if you run it with the `-local` option, it is saved locally as `output/sample.json`. The object's attributes are saved alongside it as `output/sample.json.attrs`, so the attributes of the saved object can still be retrieved without any problem.
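If you want to read those attributes through the API rather than by opening the `.attrs` file, `blob.Bucket` provides `Attributes()`. A minimal sketch against the `output` directory used above:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"gocloud.dev/blob/fileblob"
)

func main() {
	ctx := context.Background()

	bucket, err := fileblob.OpenBucket("output", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer bucket.Close()

	// Attributes() reads the metadata stored next to the object
	// (here, output/sample.json.attrs).
	attrs, err := bucket.Attributes(ctx, "sample.json")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(attrs.ContentType, attrs.Size, attrs.ModTime)
}
```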
Code that calls the APIs of external services such as AWS is always a headache to make testable, but with Go CDK you don't need to worry about that. Normally you would abstract the external service behind an interface, implement a mock, and swap the mock in for tests... but Go CDK already abstracts each service properly and provides local implementations, so you can simply use those as they are.
For example, consider testing a struct that implements the following interface for uploading encoded JSON to cloud storage:
type JSONUploader interface {
	Upload(ctx context.Context, key string, v interface{}) error
}
With the AWS SDK, interfaces for the various service clients are provided, and using them keeps your code testable. For `s3manager`, the interface is provided in the [`s3manageriface`](https://pkg.go.dev/github.com/aws/aws-sdk-go/service/s3/s3manager/s3manageriface) package.
type jsonUploader struct {
bucketName string
uploader s3manageriface.UploaderAPI
}
func (u *jsonUploader) Upload(ctx context.Context, key string, v interface{}) error {
pr, pw := io.Pipe()
go func() {
err := json.NewEncoder(pw).Encode(v)
pw.CloseWithError(err)
}()
in := &s3manager.UploadInput{
Bucket: aws.String(u.bucketName),
Key: aws.String(key),
Body: pr,
}
if _, err := u.uploader.UploadWithContext(ctx, in); err != nil {
return err
}
return nil
}
With such an implementation, you can test without actually accessing S3 by injecting a suitable mock into `jsonUploader.uploader`. However, no official mock implementation is provided, so you have to write one yourself or find a suitable external package.
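As an illustration, a hand-rolled mock might look like the following sketch (the `fakeUploader` name and the test are made up, not taken from any official package; imports are omitted as in the snippets above). Embedding the interface satisfies it, so only the method the code actually calls, `UploadWithContext`, needs a real implementation.

```go
// fakeUploader is an illustrative hand-rolled mock.
// Embedding s3manageriface.UploaderAPI satisfies the interface;
// any method other than UploadWithContext would panic if called.
type fakeUploader struct {
	s3manageriface.UploaderAPI
	body []byte // captures what was "uploaded"
}

func (f *fakeUploader) UploadWithContext(ctx aws.Context, in *s3manager.UploadInput,
	opts ...func(*s3manager.Uploader)) (*s3manager.UploadOutput, error) {
	b, err := ioutil.ReadAll(in.Body)
	if err != nil {
		return nil, err
	}
	f.body = b
	return &s3manager.UploadOutput{}, nil
}

func TestUploadWithFake(t *testing.T) {
	f := &fakeUploader{}
	uploader := &jsonUploader{bucketName: "bucket", uploader: f}

	if err := uploader.Upload(context.Background(), "test.json", map[string]string{"key": "value"}); err != nil {
		t.Fatal(err)
	}
	want := "{\"key\":\"value\"}\n"
	if string(f.body) != want {
		t.Errorf("got %q, want %q", f.body, want)
	}
}
```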
With Go CDK, a straightforward implementation is already highly testable.
type jsonUploader struct {
bucket *blob.Bucket
}
func (u *jsonUploader) Upload(ctx context.Context, key string, v interface{}) error {
w, err := u.bucket.NewWriter(ctx, key, nil)
if err != nil {
return err
}
defer w.Close()
if err := json.NewEncoder(w).Encode(v); err != nil {
return err
}
return nil
}
For testing, the in-memory `blob` implementation [`memblob`](https://pkg.go.dev/gocloud.dev@v0.20.0/blob/memblob) comes in handy.
func TestUpload(t *testing.T) {
bucket := memblob.OpenBucket(nil)
uploader := &jsonUploader{bucket: bucket}
ctx := context.Background()
key := "test.json"
type data struct {
Key1 string
Key2 string
}
in := &data{
Key1: "value1",
Key2: "value2",
}
if err := uploader.Upload(ctx, key, in); err != nil {
t.Fatal(err)
}
r, err := bucket.NewReader(ctx, key, nil)
if err != nil {
t.Fatal(err)
}
out := &data{}
if err := json.NewDecoder(r).Decode(out); err != nil {
t.Fatal(err)
}
if !reflect.DeepEqual(in, out) {
t.Error("unmatch")
}
}
In this article I introduced benefits of Go CDK other than multi-cloud support and cloud portability. Because it handles multiple cloud vendors in a unified way, it naturally has weaknesses as well, such as not being able to use features specific to a particular vendor, so in practice you will probably still reach for each vendor's own SDK where the requirements call for it.
Go CDK itself is still evolving, and I hope it gains even more functionality in the future.