How to Upload Files Larger Than 100 MB to GitHub

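When working on machine learning projects, we sometimes need to push large training data files to GitHub along with the code. GitHub, however, does not accept files larger than 100 MB, mainly because it wants to keep large files from lengthening clone times for other users. So how can we work around this? A push that contains such a file fails with the following error: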

remote: Resolving deltas: 100% (8/8), completed with 1 local object.
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: ada0de3f698c8f8a0d94664a827bf6ff
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File Day 4/HomeWork/data/application_train.csv is 158.44 MB; this exceeds GitHub's file size limit of 100.00 MB
To https://github.com/shuwn/3rd-ML100Days.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/shuwn/3rd-ML100Days.git'
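
Before pushing, it can help to look for files that would trip this limit. The following is a minimal sketch rather than part of the original workflow: `find_large_files` is a hypothetical helper name, and it assumes a `find` that accepts the `M` (mebibyte) size suffix, as GNU and macOS `find` do.

```shell
# List files in a directory tree that are larger than a size limit,
# skipping Git's own .git directory.
#   $1: directory to scan (default: current directory)
#   $2: limit in MiB (default: 100)
find_large_files() {
  dir="${1:-.}"
  limit_mib="${2:-100}"
  # -prune stops find from descending into .git; only regular files
  # above the limit reach -print.
  find "$dir" -path '*/.git' -prune -o -type f -size +"${limit_mib}"M -print
}
```

Run from the repository root as `find_large_files . 100`, this should list `application_train.csv` from the push above, since that file is 158.44 MB.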

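Git Large File Storage (Git LFS) is a good solution. In short, when you commit and push, it checks whether each file matches the patterns configured for LFS. Matching files are stored in a separate large file storage and kept out of the regular commit and push. In their place, the repository holds a small Git LFS pointer file (the "hotlink") that refers to the stored content.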
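
To make the pointer idea concrete, here is a rough sketch that prints the pointer text for a given file: a spec version line, the SHA-256 of the content, and the size in bytes. `lfs_pointer_for` is a hypothetical helper written for illustration. Git LFS generates real pointers itself, so this only shows the format.

```shell
# Print Git LFS pointer text for a file (illustration of the format only).
#   $1: path to the file
lfs_pointer_for() {
  file="$1"
  # Linux ships sha256sum; macOS ships shasum. Use whichever is present.
  if command -v sha256sum >/dev/null 2>&1; then
    oid=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    oid=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  # wc -c pads its output with spaces on some systems, so strip them.
  size=$(wc -c < "$file" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}
```

Only these three short lines are committed to Git. The file content itself lives in LFS storage and is looked up by its `oid`.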

Installation

Mac

On a Mac, you can install Git LFS with Homebrew (install Homebrew first).

Then run the following command in Terminal:

brew install git-lfs

Once it is installed, you can confirm the installation with the following command:

git lfs version

The output should look like this (the version number changes with updates):
git-lfs/2.8.0 (GitHub; darwin amd64; go 1.12.7)

Usage

  1. Run the following command in the repository directory (once in each repository where you want to use Git LFS)
git lfs install
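  2. Add the file types you want Git LFS to manage (or edit .gitattributes directly). You can add more patterns at any time; .csv files are used as the example here. Note: the pattern does not need a path.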
git lfs track "*.csv"
  3. Make sure .gitattributes is staged
git add .gitattributes
  4. After that, use GitHub Desktop or Git shell commands as usual
git add "Day 4/HomeWork/data/application_train.csv"
git commit -m "Add training data"
git push origin master
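
The tracking step can be checked locally before pushing. `git lfs track "*.csv"` records the pattern in .gitattributes as a line like `*.csv filter=lfs diff=lfs merge=lfs -text`, and Git's own `git check-attr` reports which filter applies to a given path. A minimal sketch, where `lfs_filter_for` is a hypothetical helper name:

```shell
# Print the "filter" attribute Git would apply to a path:
# "lfs" for paths matched by an LFS pattern, "unspecified" otherwise.
# Must be run inside a Git repository.
#   $1: path relative to the repository root (the file need not exist)
lfs_filter_for() {
  # check-attr prints "<path>: filter: <value>"; keep only the value.
  git check-attr filter -- "$1" | sed 's/^.*: filter: //'
}
```

With the `*.csv` pattern in place, `lfs_filter_for "Day 4/HomeWork/data/application_train.csv"` should print `lfs`. A pattern without a slash matches at any directory depth, which is why no path is needed when tracking.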

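If the push still fails with an error like the one below, the large file is most likely already stored as a regular Git object in an earlier commit, and tracking the pattern does not change existing history. In that case, follow the next section to fix it.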

remote: Resolving deltas: 100% (8/8), completed with 1 local object.
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: ada0de3f698c8f8a0d94664a827bf6ff
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File Day 4/HomeWork/data/application_train.csv is 158.44 MB; this exceeds GitHub's file size limit of 100.00 MB
To https://github.com/shuwn/3rd-ML100Days.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/shuwn/3rd-ML100Days.git'

Fixing the push error

  1. Check for large files
git lfs migrate info

Example output:

migrate: Fetching remote refs: ..., done
migrate: Sorting commits: ..., done
migrate: Examining commits: 100% (5/5), done
*.csv 166 MB 2/2 files(s) 100%
*.pdf 9.8 MB 5/5 files(s) 100%
*.ipynb 1.2 MB 11/11 files(s) 100%
*.md 16 B 1/1 files(s) 100%
  2. Import the large file types or files into Git LFS, which rewrites the commits so that they hold pointer files instead
git lfs migrate import --include="*.csv"

Example output:

migrate: Fetching remote refs: ..., done
migrate: Sorting commits: ..., done
migrate: Rewriting commits: 100% (5/5), done
master 9c40e4174d0ace4d978252f7c0ee785a3b17266a -> 00d72b7a188d84a217b436ad9331f577a7985d2c
migrate: Updating refs: ..., done
migrate: checkout: ..., done
  3. Check again whether any large files remain
git lfs migrate info

Example output:

migrate: Fetching remote refs: ..., done
migrate: Sorting commits: ..., done
migrate: Examining commits: 100% (5/5), done
*.pdf 9.8 MB 5/5 files(s) 100%
*.ipynb 1.2 MB 11/11 files(s) 100%
*.csv 134 B 1/1 files(s) 100%
*.gitattributes 42 B 1/1 files(s) 100%
*.md 16 B 1/1 files(s) 100%
  4. Push again, and the problem is solved!
git push origin

Example output:

Uploading LFS objects:   0% (0/1), 0 B | 0 B/s 
Uploading LFS objects: 100% (1/1), 166 MB | 0 B/s, done
Enumerating objects: 29, done.
Counting objects: 100% (29/29), done.
Delta compression using up to 12 threads
Compressing objects: 100% (24/24), done.
Writing objects: 100% (28/28), 2.58 MiB | 412.00 KiB/s, done.
Total 28 (delta 5), reused 0 (delta 0)
remote: Resolving deltas: 100% (5/5), done.
To https://github.com/shuwn/3rd-ML100Days.git
c616224..00d72b7 master -> master
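
As an extra check after the migration, you can confirm that the committed CSV is now a pointer rather than the data itself. The sketch below assumes the first line of a pointer is the spec version line. `is_lfs_pointer` is a hypothetical helper name that reads the content from standard input.

```shell
# Succeed (exit status 0) if standard input starts like a Git LFS pointer.
is_lfs_pointer() {
  # Only the first line is needed to recognize a pointer.
  first_line=$(head -n 1)
  [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}
```

`git cat-file -p` prints a blob as it is stored in Git, without the LFS filter applied, so one way to use the helper is `git cat-file -p "HEAD:Day 4/HomeWork/data/application_train.csv" | is_lfs_pointer && echo "stored as LFS pointer"`. The `*.csv 134 B` line in the second `git lfs migrate info` output is consistent with this: a pointer with a 9-digit size is 43 + 76 + 15 = 134 bytes (version line, oid line, size line, each including its newline).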