August 8, 2024, 玄貓 (BlackCat)

Restoring Files from EBS Snapshots and Synchronizing Data with DataSync

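EBS snapshots are a core backup and restore mechanism in AWS. To restore specific files from an EBS snapshot, you create a new EBS volume from the snapshot and attach it to an EC2 instance. The process involves creating the snapshot, creating and attaching the volume, copying the files, and cleaning up afterward. AWS DataSync, in turn, offers a convenient way to synchronize data between storage services, for example copying data between Amazon EFS and Amazon S3. DataSync simplifies data migration and synchronization and helps ensure the consistency and integrity of the transferred data. The sections below walk through these operations using the AWS CLI and related services.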

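Restoring Files from an EBS Snapshot

Problem

You need to restore a file from a snapshot of an EBS volume in your account.

Solution

Create a volume from the snapshot, attach the volume to an EC2 instance, and copy the file to the instance's volume (see Figure 3-22).

Process diagram

This diagram shows the process of restoring a file from a snapshot.

Preparation

Follow the steps in this recipe's folder in the chapter code repository.

Steps

Find the ID of the EBS volume attached to the EC2 instance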

ORIG_VOLUME_ID=$(aws ec2 describe-volumes \
--filters Name=attachment.instance-id,Values=$INSTANCE_ID \
--output text \
--query 'Volumes[0].Attachments[0].VolumeId')

Create a snapshot of the EBS volume

SNAPSHOT_ID=$(aws ec2 create-snapshot \
--volume-id $ORIG_VOLUME_ID \
--output text --query SnapshotId)
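A snapshot is created asynchronously, so it can still be in the pending state when create-snapshot returns. The sketch below wraps the AWS CLI waiter `aws ec2 wait snapshot-completed` in a small helper so the next step only runs once the snapshot is complete. The function name `wait_for_snapshot` is an illustrative assumption and not part of the recipe.

```shell
# Block until the snapshot is complete.
# Assumes the AWS CLI is configured; wait_for_snapshot is a hypothetical helper name.
wait_for_snapshot() {
  snapshot_id="$1"
  # An empty value or the literal "None" means the previous command returned nothing useful
  if [ -z "$snapshot_id" ] || [ "$snapshot_id" = "None" ]; then
    echo "wait_for_snapshot: missing snapshot ID" >&2
    return 1
  fi
  # The waiter polls describe-snapshots until the snapshot state is "completed"
  aws ec2 wait snapshot-completed --snapshot-ids "$snapshot_id"
}

# Usage (after the create-snapshot step):
#   wait_for_snapshot "$SNAPSHOT_ID"
```

Checking for an empty ID first avoids waiting on a request that could never succeed.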

Create a new volume from the snapshot and save its ID

SNAP_VOLUME_ID=$(aws ec2 create-volume \
--snapshot-id $SNAPSHOT_ID \
--size 8 \
--volume-type gp2 \
--availability-zone us-east-1a \
--output text --query VolumeId)
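A volume can only be attached to an instance in the same Availability Zone, so the hardcoded us-east-1a works only when the instance runs there. As a sketch, assuming `INSTANCE_ID` is already set, the zone can be looked up from the instance instead. The helper name `instance_az` is hypothetical.

```shell
# Print the Availability Zone of an EC2 instance.
# instance_az is an illustrative helper, not part of the recipe.
instance_az() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --output text \
    --query 'Reservations[0].Instances[0].Placement.AvailabilityZone'
}

# Usage:
#   AZ=$(instance_az "$INSTANCE_ID")
#   then pass --availability-zone "$AZ" to aws ec2 create-volume
```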

Attach the new volume to the EC2 instance

aws ec2 attach-volume --volume-id $SNAP_VOLUME_ID \
--instance-id $INSTANCE_ID --device /dev/sdf

Wait for the volume to finish attaching (repeat the command until the attachment State is attached)

aws ec2 describe-volumes \
--volume-ids $SNAP_VOLUME_ID
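Running describe-volumes by hand works, but the check can also be scripted. The following sketch polls the attachment state until it reads attached. The helper name `wait_until_attached` and its retry defaults are assumptions chosen for illustration. The built-in waiter `aws ec2 wait volume-in-use` is an alternative, although it watches the volume state rather than the attachment state.

```shell
# Poll until the volume's first attachment reports "attached".
# Arguments: volume ID, max number of checks (default 30), delay in seconds (default 5).
wait_until_attached() {
  volume_id="$1"
  max_tries="${2:-30}"
  delay="${3:-5}"
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    state=$(aws ec2 describe-volumes \
      --volume-ids "$volume_id" \
      --output text \
      --query 'Volumes[0].Attachments[0].State')
    if [ "$state" = "attached" ]; then
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  echo "volume $volume_id not attached after $max_tries checks" >&2
  return 1
}

# Usage:
#   wait_until_attached "$SNAP_VOLUME_ID"
```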

Connect to the EC2 instance and mount the new volume
- Connect to the EC2 instance with SSM Session Manager:
```
aws ssm start-session --target $INSTANCE_ID
```
- Run lsblk to list the block devices:
```
lsblk
```
- Create a mount point and mount the new volume:
```
sudo mkdir /mnt/restore
sudo mount -t xfs -o nouuid /dev/nvme1n1p1 /mnt/restore
```

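Code walkthrough:

lsblk: lists information about block devices, which helps identify the newly attached volume.
The -o nouuid mount option: XFS identifies each file system by its UUID and refuses to mount a file system whose UUID is already mounted. Because the volume created from the snapshot carries the same UUID as the original volume, -o nouuid is needed to skip that check and allow the mount.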
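Because nouuid is an XFS-specific option, a script that may also meet other file systems can pick its mount options from the detected type. This is a minimal sketch. The helper name `restore_mount_opts` is hypothetical, and the device name in the usage comment is taken from the example above and may differ on your instance.

```shell
# Choose mount options for a given file system type (illustrative helper).
restore_mount_opts() {
  case "$1" in
    xfs) echo "nouuid" ;;    # skip the XFS duplicate-UUID check
    *)   echo "defaults" ;;  # other file systems need no special option here
  esac
}

# Usage on the instance (device name assumed from the lsblk output):
#   FSTYPE=$(sudo lsblk -no FSTYPE /dev/nvme1n1p1)
#   sudo mount -t "$FSTYPE" -o "$(restore_mount_opts "$FSTYPE")" /dev/nvme1n1p1 /mnt/restore
```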

Copy the required file and unmount the volume

sudo cp /mnt/restore/home/ec2-user/.bash_profile \
/tmp/.bash_profile.restored
sudo umount /dev/nvme1n1p1
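The commands above leave the restored copy in /tmp. If you then want to put it back in place, it can help to compare it with the current file first and keep a backup. The helper below is only a sketch; the name `restore_if_changed` and the `.bak` suffix are assumptions.

```shell
# Copy a restored file over the current one only when the contents differ.
# The current file, if any, is preserved as <target>.bak.
restore_if_changed() {
  restored="$1"
  target="$2"
  if [ ! -f "$restored" ]; then
    echo "restored file not found: $restored" >&2
    return 1
  fi
  # cmp -s is silent and returns 0 when both files are byte-identical
  if [ -f "$target" ] && cmp -s "$restored" "$target"; then
    echo "unchanged"
    return 0
  fi
  if [ -f "$target" ]; then
    cp -p "$target" "$target.bak"
  fi
  cp -p "$restored" "$target"
  echo "restored"
}

# Usage:
#   restore_if_changed /tmp/.bash_profile.restored /home/ec2-user/.bash_profile
```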

Cleanup

Follow the steps in this recipe's folder in the chapter code repository.
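The repository's cleanup steps are not reproduced in this article. As a rough sketch of what removing the temporary resources could look like, assuming the volume has already been unmounted inside the instance, the helper below detaches and deletes the restore volume and then deletes the snapshot. The name `cleanup_restore` is hypothetical, and the actual repository steps may differ.

```shell
# Remove the temporary restore volume and the snapshot it was created from.
# Arguments: restore volume ID, snapshot ID.
cleanup_restore() {
  restore_volume_id="$1"
  snapshot_id="$2"
  aws ec2 detach-volume --volume-id "$restore_volume_id" || return 1
  # A volume must be in the "available" state before it can be deleted
  aws ec2 wait volume-available --volume-ids "$restore_volume_id" || return 1
  aws ec2 delete-volume --volume-id "$restore_volume_id" || return 1
  aws ec2 delete-snapshot --snapshot-id "$snapshot_id"
}

# Usage:
#   cleanup_restore "$SNAP_VOLUME_ID" "$SNAPSHOT_ID"
```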

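Discussion

EBS snapshots are an important part of a backup strategy for EC2. With snapshots, you can restore an EC2 instance to a specific point in time when needed. You can also create an EBS volume from a snapshot and attach it to a running instance to restore individual files.

Challenge: Create an AMI from the snapshot you created and launch a new instance from it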

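Copying Data Between EFS and S3 with DataSync

Problem

You need to copy files from Amazon S3 to Amazon EFS.

Solution

Configure AWS DataSync with S3 as the source and EFS as the destination, then create a DataSync task and start the transfer, as shown in Figure 3-23.

Preparation

Follow the steps in this recipe's folder in the chapter code repository.

Steps

Create an IAM role and attach the required policy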

S3_ROLE_ARN=$(aws iam create-role --role-name AWSCookbookS3LocationRole \
--assume-role-policy-document file://assume-role-policy.json \
--output text --query Role.Arn)

aws iam attach-role-policy --role-name AWSCookbookS3LocationRole \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

Create the DataSync S3 location

S3_LOCATION_ARN=$(aws datasync create-location-s3 \
--s3-bucket-arn $BUCKET_ARN \
--s3-config BucketAccessRoleArn=$S3_ROLE_ARN \
--output text --query LocationArn)

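Code walkthrough:

create-location-s3: creates an S3 location for a DataSync task; it requires the ARN of the S3 bucket and the ARN of the access role.
attach-role-policy: attaches the required policy to the IAM role so that DataSync can access the S3 bucket.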
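The contents of assume-role-policy.json are not shown in this article. A trust policy that lets the DataSync service assume the role would typically look like the one written by the sketch below. Treat it as an assumption rather than the exact file from the repository; the helper name `write_datasync_trust_policy` is hypothetical.

```shell
# Write a trust policy that allows the DataSync service to assume a role.
# Argument: path of the JSON file to create.
write_datasync_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "datasync.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Usage:
#   write_datasync_trust_policy assume-role-policy.json
```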

4. Create an IAM role using the statement in the provided assume-role-policy.json file

Use the following command to create the IAM role and capture its ARN:

EFS_ROLE_ARN=$(aws iam create-role --role-name AWSCookbookEFSLocationRole \
--assume-role-policy-document file://assume-role-policy.json \
--output text --query Role.Arn)

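Explanation:

aws iam create-role: creates a new IAM role.
--role-name AWSCookbookEFSLocationRole: sets the name of the IAM role.
--assume-role-policy-document file://assume-role-policy.json: supplies the trust policy file, which defines which services may assume the role.
--output text --query Role.Arn: prints only the ARN of the new IAM role.

5. Attach the AmazonElasticFileSystemClientReadWriteAccess IAM managed policy to the IAM role

Use the following command to attach the required permissions to the IAM role: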

aws iam attach-role-policy --role-name AWSCookbookEFSLocationRole \
--policy-arn arn:aws:iam::aws:policy/AmazonElasticFileSystemClientReadWriteAccess

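Explanation:

aws iam attach-role-policy: attaches an IAM policy to the specified IAM role.
--role-name AWSCookbookEFSLocationRole: names the IAM role that receives the policy.
--policy-arn: the ARN of the IAM policy to attach.

6. Get the ARN of the EFS file system

Use the following command to get the ARN of the EFS file system: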

EFS_FILE_SYSTEM_ARN=$(aws efs describe-file-systems \
--file-system-id $EFS_ID \
--output text --query 'FileSystems[0].FileSystemArn')

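Explanation:

aws efs describe-file-systems: describes the specified EFS file system.
--file-system-id $EFS_ID: the ID of the EFS file system to query.
--output text --query FileSystems[0].FileSystemArn: prints only the ARN of the EFS file system.

Create the DataSync EFS location and task

Use the following commands to create the DataSync EFS location and the task, and then run the transfer: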

EFS_LOCATION_ARN=$(aws datasync create-location-efs \
--efs-filesystem-arn $EFS_FILE_SYSTEM_ARN \
--ec2-config SubnetArn=$SUBNET_ARN,SecurityGroupArns=[$SG_ARN] \
--output text --query LocationArn)

TASK_ARN=$(aws datasync create-task \
--source-location-arn $S3_LOCATION_ARN \
--destination-location-arn $EFS_LOCATION_ARN \
--output text --query TaskArn)

aws datasync start-task-execution \
--task-arn $TASK_ARN

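Explanation:

aws datasync create-location-efs: creates a DataSync location for the EFS file system.
aws datasync create-task: creates the DataSync task that copies data from the S3 location to the EFS location.
aws datasync start-task-execution: starts an execution of the DataSync task.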
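start-task-execution returns as soon as the execution is queued, so the files may not be on EFS yet when you check. The sketch below polls `aws datasync describe-task-execution` until the status is SUCCESS or ERROR. The helper name `wait_for_task_execution`, the `EXECUTION_ARN` variable, and the retry defaults are assumptions made for illustration.

```shell
# Poll a DataSync task execution until it finishes.
# Arguments: task execution ARN, max checks (default 60), delay in seconds (default 10).
# Returns 0 on SUCCESS, 1 on ERROR, 2 on timeout.
wait_for_task_execution() {
  execution_arn="$1"
  max_tries="${2:-60}"
  delay="${3:-10}"
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    exec_status=$(aws datasync describe-task-execution \
      --task-execution-arn "$execution_arn" \
      --output text --query Status)
    case "$exec_status" in
      SUCCESS) return 0 ;;
      ERROR)   echo "task execution failed" >&2; return 1 ;;
    esac
    try=$((try + 1))
    sleep "$delay"
  done
  echo "timed out waiting for task execution" >&2
  return 2
}

# Usage:
#   EXECUTION_ARN=$(aws datasync start-task-execution --task-arn "$TASK_ARN" \
#     --output text --query TaskExecutionArn)
#   wait_for_task_execution "$EXECUTION_ARN"
```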

Connect to the EC2 instance and verify the result of the transfer

Use SSM Session Manager to connect to the EC2 instance and check the files in the EFS mount directory:

aws ssm start-session --target $INSTANCE_ID

sh-4.2$ cd /mnt/efs
sh-4.2$ ls -al

Explanation:

aws ssm start-session: connects to the specified EC2 instance through SSM Session Manager.
cd /mnt/efs: changes to the EFS mount directory.
ls -al: lists the files in the directory with their details.

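Benefits of DataSync and Challenges

DataSync provides a simple, reliable way to synchronize file data between AWS services. It preserves file metadata, verifies file integrity during the transfer, and supports retries. This makes it useful for developers and cloud engineers who need to move data between different sources and destinations.

Challenge 1: Set up a DataSync task that excludes files by name in a specific folder

You can do this by specifying exclude filters in the DataSync task configuration.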
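DataSync exclude filters use the SIMPLE_PATTERN filter type, with multiple patterns separated by a pipe character in a single value. As a sketch, the helper below joins patterns into that form. The helper name `join_exclude_patterns` and the example patterns are hypothetical.

```shell
# Join one or more patterns with "|" as expected by a DataSync filter value.
join_exclude_patterns() {
  result=""
  for pattern in "$@"; do
    if [ -z "$result" ]; then
      result="$pattern"
    else
      result="$result|$pattern"
    fi
  done
  printf '%s\n' "$result"
}

# Usage (patterns are examples only):
#   aws datasync update-task --task-arn "$TASK_ARN" \
#     --excludes "FilterType=SIMPLE_PATTERN,Value=$(join_exclude_patterns /private-folder '*.tmp')"
```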

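Challenge 2: Set up a scheduled DataSync task to copy data from S3 to EFS every hour

The shortest automatic schedule interval that DataSync currently supports is one hour. See the DataSync user guide for more details.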
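One way to approach this challenge is to add a schedule to the existing task with `aws datasync update-task`. The sketch below assumes an AWS-style cron expression that fires at the top of every hour. The helper name `schedule_task_hourly` is hypothetical, and you should confirm the expression against the DataSync documentation before relying on it.

```shell
# Attach an hourly schedule to an existing DataSync task.
# Argument: task ARN.
schedule_task_hourly() {
  # cron(minute hour day-of-month month day-of-week year): minute 0 of every hour
  aws datasync update-task \
    --task-arn "$1" \
    --schedule 'ScheduleExpression=cron(0 * * * ? *)'
}

# Usage:
#   schedule_task_hourly "$TASK_ARN"
```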

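Creating a PostgreSQL Database with Amazon Aurora Serverless

Problem

Your web application receives unpredictable requests and needs to store data in a relational database. You need a database solution that scales with usage and is cost-effective. You want a solution with low operational overhead that is compatible with your existing PostgreSQL-based application.

Solution

Configure and create an Aurora Serverless database cluster with a complex password. Then apply a custom scaling configuration and enable automatic pausing after inactivity. Scaling activity adjusts according to the configured policy, as shown in Figure 4-1.

Figure 4-1: An Aurora Serverless cluster scaling its compute resources

Preparation

Create a VPC with isolated subnets in two Availability Zones, with associated route tables.
Deploy an EC2 instance; you will connect to this instance for testing.
Clone the code repository for this chapter: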

git clone https://github.com/AWSCookbook/Databases

Steps

Generate a complex password with AWS Secrets Manager:

ADMIN_PASSWORD=$(aws secretsmanager get-random-password \
--exclude-punctuation \
--password-length 41 --require-each-included-type \
--output text \
--query RandomPassword)

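Explanation:

aws secretsmanager get-random-password: generates a complex password.
--exclude-punctuation: leaves out punctuation, which avoids characters such as /, ", and @ that Amazon RDS does not accept in a master password.
--password-length 41: sets the password length to 41 characters.
--require-each-included-type: ensures the password contains at least one character of each included type. With punctuation excluded, that means at least one uppercase letter, one lowercase letter, and one digit.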
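Before passing the generated value to create-db-cluster, a quick local check can catch an empty or malformed password. The sketch below assumes the constraints described above: 8 to 41 characters, and letters and digits only because punctuation was excluded. The helper name `is_valid_master_password` is hypothetical.

```shell
# Return 0 when the password is 8 to 41 characters long and purely alphanumeric.
is_valid_master_password() {
  password="$1"
  length=${#password}
  if [ "$length" -lt 8 ] || [ "$length" -gt 41 ]; then
    return 1
  fi
  case "$password" in
    *[!A-Za-z0-9]*) return 1 ;;  # contains a character outside letters and digits
  esac
  return 0
}

# Usage:
#   is_valid_master_password "$ADMIN_PASSWORD" || echo "unexpected password format" >&2
```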

Create a database subnet group that specifies the VPC subnets the cluster will use:

aws rds create-db-subnet-group \
--db-subnet-group-name awscookbook401subnetgroup \
--db-subnet-group-description "AWSCookbook401 subnet group" \
--subnet-ids $SUBNET_ID_1 $SUBNET_ID_2

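Explanation:

aws rds create-db-subnet-group: creates the database subnet group.
--db-subnet-group-name: sets the name of the subnet group.
--subnet-ids: lists the subnet IDs the database will use.

Create a VPC security group for the database: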

DB_SECURITY_GROUP_ID=$(aws ec2 create-security-group \
--group-name AWSCookbook401sg \
--description "Aurora Serverless Security Group" \
--vpc-id $VPC_ID --output text --query GroupId)

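Explanation:

aws ec2 create-security-group: creates the security group.
--group-name and --description: set the name and description of the security group.
--vpc-id: the VPC the security group belongs to.

Create the database cluster, specifying serverless as the engine mode: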

aws rds create-db-cluster \
--db-cluster-identifier awscookbook401dbcluster \
--engine aurora-postgresql \
--engine-mode serverless \
--engine-version 10.14 \
--master-username dbadmin \
--master-user-password $ADMIN_PASSWORD \
--db-subnet-group-name awscookbook401subnetgroup \
--vpc-security-group-ids $DB_SECURITY_GROUP_ID

Explanation:

aws rds create-db-cluster: creates the database cluster.
--engine-mode serverless: selects Serverless mode.
--master-username and --master-user-password: set the master username and password.

Wait for the cluster status to become available:

aws rds describe-db-clusters \
--db-cluster-identifier awscookbook401dbcluster \
--output text --query 'DBClusters[0].Status'

Explanation:

aws rds describe-db-clusters: queries the cluster status.
--db-cluster-identifier: the ID of the cluster to query.

Modify the database to scale automatically and enable AutoPause:

aws rds modify-db-cluster \
--db-cluster-identifier awscookbook401dbcluster --scaling-configuration \
MinCapacity=8,MaxCapacity=16,SecondsUntilAutoPause=300,TimeoutAction='ForceApplyCapacityChange',AutoPause=true

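Explanation:

aws rds modify-db-cluster: modifies the cluster configuration.
--scaling-configuration: sets the scaling configuration, including the minimum and maximum capacity and the idle time before the cluster pauses automatically.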
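The scaling configuration is passed as one comma-separated string, which is easy to mistype. The sketch below builds that string and rejects obviously invalid input. It assumes the capacity values accepted for Aurora PostgreSQL in serverless mode are 2, 4, 8, 16, 32, 64, 192, and 384, and that SecondsUntilAutoPause must lie between 300 and 86400; verify both against the current RDS documentation. The helper name `scaling_configuration` is hypothetical.

```shell
# Build the --scaling-configuration value after basic validation.
# Arguments: min capacity, max capacity, seconds until auto pause.
scaling_configuration() {
  min="$1"
  max="$2"
  pause_seconds="$3"
  for value in "$min" "$max"; do
    case "$value" in
      2|4|8|16|32|64|192|384) ;;
      *) echo "unsupported capacity: $value" >&2; return 1 ;;
    esac
  done
  if [ "$min" -gt "$max" ]; then
    echo "MinCapacity must not exceed MaxCapacity" >&2
    return 1
  fi
  if [ "$pause_seconds" -lt 300 ] || [ "$pause_seconds" -gt 86400 ]; then
    echo "SecondsUntilAutoPause must be between 300 and 86400" >&2
    return 1
  fi
  printf 'MinCapacity=%s,MaxCapacity=%s,SecondsUntilAutoPause=%s,TimeoutAction=ForceApplyCapacityChange,AutoPause=true\n' \
    "$min" "$max" "$pause_seconds"
}

# Usage:
#   aws rds modify-db-cluster \
#     --db-cluster-identifier awscookbook401dbcluster \
#     --scaling-configuration "$(scaling_configuration 8 16 300)"
```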

Allow the EC2 instance's security group to reach the default PostgreSQL port:

aws ec2 authorize-security-group-ingress \
--protocol tcp --port 5432 \
--source-group $INSTANCE_SG \
--group-id $DB_SECURITY_GROUP_ID

Explanation:

aws ec2 authorize-security-group-ingress: adds an inbound rule to the security group.
--protocol tcp --port 5432: allows TCP traffic on port 5432.
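With the inbound rule in place, the connection can be tested from the EC2 instance. As a sketch, the helper below looks up the cluster endpoint. The name `cluster_endpoint` is hypothetical, and the usage comment assumes that psql is installed on the instance and that the default postgres database is used.

```shell
# Print the writer endpoint of an Aurora cluster.
# Argument: DB cluster identifier.
cluster_endpoint() {
  aws rds describe-db-clusters \
    --db-cluster-identifier "$1" \
    --output text --query 'DBClusters[0].Endpoint'
}

# Usage from the EC2 instance (assumes psql is installed there):
#   psql --host "$(cluster_endpoint awscookbook401dbcluster)" \
#     --username dbadmin --dbname postgres
```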