马春杰杰

Removing Large Files from Git History

Scenario:

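A large file was accidentally included in a commit, and every later tag and branch carried it along. It only became obvious on a later git pull, which was painfully slow: the repository had grown to more than 100 MB. At that point we want to get rid of the culprit, not just on the current branch but in every record across the whole repository that contains the file.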

Solution:

The procedure has quite a few steps, but it is mostly mechanical. Just run the following commands in order:

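# 1. Repack, then check how much space the object database uses (size-pack is in KiB)
git gc
git count-objects -v
# 2. List the three largest objects in the pack files (column 3 is the object size in bytes)
git verify-pack -v .git/objects/pack/pack-*.idx | sort -k 3 -n | tail -3
# 3. Map an object ID from step 2 to its path (replace the ID with your own)
git rev-list --objects --all | grep bbbd19a8d4c87677cb0cf64833a5eb1ce4b95e40
# 4. See which commits touched that path (replace the path with your own)
git log --pretty=oneline --branches -- tools/analyze_weights_featuremap/aaa.npy
# 5. Rewrite all branches and tags so that the file is dropped from every commit
git filter-branch --force --index-filter 'git rm -rf --cached --ignore-unmatch tools/analyze_weights_featuremap/aaa.npy' --prune-empty --tag-name-filter cat -- --all
# 6. Force-push the rewritten master
git push origin master --force
# 7. Preview, then delete, the backup refs that filter-branch saved under refs/original
git for-each-ref --format='delete %(refname)' refs/original
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
# 8. Expire the reflog and prune the now-unreachable objects
git reflog expire --expire=now --all
git gc --prune=now
# 9. Force-push all remaining branches and tags
git push origin --force --all
git push origin --force --tags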
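
The raw output of git verify-pack -v is hard to read at a glance: each object line holds the object ID, the type, the size in bytes, the size inside the pack file, and the offset in the pack. The following is a minimal sketch of a filter that keeps only blobs and prints sizes in MiB. The function name top_blobs is hypothetical and not part of Git; it only post-processes the text it receives on stdin.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): reads `git verify-pack -v` output on
# stdin and prints the N largest blobs as "<size> MiB <object id>".
# Columns in the input: object id, type, size, size in pack, offset in pack.
top_blobs() {
    n="${1:-3}"
    # Keep blob lines only; summary lines such as "non delta: ..." are dropped.
    LC_ALL=C awk '$2 == "blob"' |
        # Sort on the raw byte count (column 3) before any rounding happens.
        sort -k3,3 -n |
        tail -n "$n" |
        LC_ALL=C awk '{ printf "%.1f MiB %s\n", $3 / 1048576, $1 }'
}

# Usage, from the top level of a repository:
#   git verify-pack -v .git/objects/pack/pack-*.idx | top_blobs 3
```

The object IDs it prints can then be passed to the git rev-list --objects --all | grep step to recover the file paths.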

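Note: after this, every other existing clone has to be brought in line with the rewritten history. Re-cloning is the simplest way; a plain git pull would merge the old history, large file included, back in.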
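
One way to update an existing clone without re-cloning is to fetch the rewritten refs and hard-reset the local branch onto them. The following is a minimal sketch; the function name resync_clone is hypothetical, and the remote and branch names default to origin and master only because those are the names used in this post. It discards local commits and uncommitted changes on that branch, so anything worth keeping should be saved elsewhere first.

```shell
#!/bin/sh
# Hypothetical helper: realign an existing clone with a force-pushed remote.
# WARNING: local commits and uncommitted changes on the branch are discarded.
resync_clone() {
    remote="${1:-origin}"
    branch="${2:-master}"
    # --force lets rewritten tags overwrite the local copies of those tags.
    git fetch --tags --force "$remote" &&
        git checkout "$branch" &&
        git reset --hard "$remote/$branch"
}

# Usage, inside the stale clone:
#   resync_clone origin master
```

The old objects stay in that clone's object database until its own reflog entries expire and git gc prunes them, so the disk usage of the clone does not shrink immediately.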

Update: here is the record of an actual run:

mcj@mcjdeiMac:~/backup_xxxxx$ git verify-pack -v .git/objects/pack/pack-*.idx | sort -k 3 -g | tail -3
c9d2aee6ea6badf89705bb6c46913d3c247b9737 blob   47520466 9867210 42228977
a979ada61643dccf098992f0b45107cd254ef7bc blob   55619712 21544630 8267238
18076d60846b7c14c7f44fa3d7eedd258f82e72d blob   104858618 79229339 63889446
mcj@mcjdeiMac:~/backup_xxxxx$ git rev-list --objects --all | grep  18076d60
18076d60846b7c14c7f44fa3d7eedd258f82e72d tools/analyze_weights_featuremap/outputs.npy
mcj@mcjdeiMac:~/backup_xxxxx$ git log --pretty=oneline --branches -- tools/analyze_weights_featuremap/outputs.npy
2b25df1d61800ae416a5f0dfc07de4e160faaa80 remove npy
244f6854fa4dcbc076ddf84b747bd6990d775d60 修改原来的配置文件
mcj@mcjdeiMac:~/backup_xxxxx$ git filter-branch --force --index-filter 'git rm -rf --cached --ignore-unmatch tools/analyze_weights_featuremap/outputs.npy' --prune-empty --tag-name-filter cat -- --all
WARNING: git-filter-branch has a glut of gotchas generating mangled history
	 rewrites.  Hit Ctrl-C before proceeding to abort, then use an
	 alternative filtering tool such as 'git filter-repo'
	 (https://github.com/newren/git-filter-repo/) instead.  See the
	 filter-branch manual page for more details; to squelch this warning,
	 set FILTER_BRANCH_SQUELCH_WARNING=1.
Proceeding with filter-branch...

Rewrite 244f6854fa4dcbc076ddf84b747bd6990d775d60 (67/138) (3 seconds passed, remaining 3 predicted)    rm 'tools/analyze_weights_featuremap/outputs.npy'
Rewrite 168bf72535453202649dc55deebb6ce6a98d3b4e (67/138) (3 seconds passed, remaining 3 predicted)    rm 'tools/analyze_weights_featuremap/outputs.npy'
Rewrite f10489779dc434b8f7b8880a4678b707a6c94d97 (136/138) (6 seconds passed, remaining 0 predicted)    
Ref 'refs/heads/master' was rewritten
Ref 'refs/remotes/origin/master' was rewritten
WARNING: Ref 'refs/remotes/origin/master' is unchanged
Ref 'refs/remotes/origin/share' was rewritten
Ref 'refs/remotes/origin/test' was rewritten
Ref 'refs/tags/fcos' was rewritten
Ref 'refs/tags/v1.0' was rewritten
Ref 'refs/tags/v2.0' was rewritten
Ref 'refs/tags/v2.1' was rewritten
Ref 'refs/tags/v3.0' was rewritten
Ref 'refs/tags/v4.0' was rewritten
v1.0 -> v1.0 (04fd678b165902398f1df0e118b43a578535befa -> f75f9ae97e78a51ef01491b310037b3a1d59476d)
v2.0 -> v2.0 (fe6efc51f8fbe1c84b5aee3d8ea2ccf2ce1abd8f -> 40c65d6de73d687579febadfb3931bef1a6adf6d)
v2.1 -> v2.1 (b807ef04b50a6dcc70e32df6e0be1926d36121ce -> 65843f7b5b60a404344992e379cc8c8348acfb0b)
v3.0 -> v3.0 (7aa281178245d9c8bc5d35b635633580ecf1498e -> 9e382e110aeb9decfad5aea21cad77eca3be2b74)
v4.0 -> v4.0 (3b85934d0c70ab94d262dd1631a21c76331b9f6e -> bd740539f8b0151b3edfb295d79d5481776b3726)
mcj@mcjdeiMac:~/t6$ git push origin master --force
Enumerating objects: 1770, done.
Counting objects: 100% (1770/1770), done.
Delta compression using up to 6 threads
Compressing objects: 100% (807/807), done.
Writing objects: 100% (1770/1770), 60.83 MiB | 4.65 MiB/s, done.
Total 1770 (delta 963), reused 1616 (delta 945)
remote: Resolving deltas: 100% (963/963), done.
remote: Powered by GITEE.COM [GNK-5.0]
To gitee.com:mcj686/xxxxx-_mcj.git
 + 71b86b0...8007b30 master -> master (forced update)
mcj@mcjdeiMac:~/t6$ git for-each-ref --format='delete %(refname)' refs/original
delete refs/original/refs/heads/master
delete refs/original/refs/remotes/origin/master
delete refs/original/refs/remotes/origin/share
delete refs/original/refs/remotes/origin/test
delete refs/original/refs/tags/v1.0
delete refs/original/refs/tags/v2.0
delete refs/original/refs/tags/v2.1
delete refs/original/refs/tags/v3.0
delete refs/original/refs/tags/v4.0
mcj@mcjdeiMac:~/t6$ git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
mcj@mcjdeiMac:~/t6$ git reflog expire --expire=now --all
mcj@mcjdeiMac:~/t6$ git gc --prune=now
Enumerating objects: 1953, done.
Counting objects: 100% (1953/1953), done.
Delta compression using up to 6 threads
Compressing objects: 100% (842/842), done.
Writing objects: 100% (1953/1953), done.
Total 1953 (delta 1093), reused 1837 (delta 1091)
Computing commit graph generation numbers: 100% (107/107), done.
mcj@mcjdeiMac:~/t6$ git push origin --force --all
Everything up-to-date
mcj@mcjdeiMac:~/t6$ git push origin --force --tags
Enumerating objects: 50, done.
Counting objects: 100% (50/50), done.
Delta compression using up to 6 threads
Compressing objects: 100% (25/25), done.
Writing objects: 100% (32/32), 4.78 KiB | 4.78 MiB/s, done.
Total 32 (delta 18), reused 19 (delta 6)
remote: Powered by GITEE.COM [GNK-5.0]
To gitee.com:mcj686/xxxx_mcj.git
 + db0be8e...c40258c v1.0 -> v1.0 (forced update)
 + d8cf4cd...eb854d0 v2.0 -> v2.0 (forced update)
 + c93d4cd...99c9ff6 v2.1 -> v2.1 (forced update)
 + e9a349b...8d25531 v3.0 -> v3.0 (forced update)
 + 8b68091...9e35598 v4.0 -> v4.0 (forced update)
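
The warning that git filter-branch prints in the log above recommends git filter-repo as an alternative. The following is a minimal sketch of the same file removal with that tool, assuming git-filter-repo has been installed separately (it does not ship with Git). The function name purge_path is hypothetical. By default filter-repo refuses to run on anything but a fresh clone unless --force is added, and to my understanding it also removes the origin remote after rewriting, so the remote has to be re-added before force-pushing.

```shell
#!/bin/sh
# Hypothetical wrapper around git filter-repo (a separate tool, assumed to be
# installed and on PATH). Run it from the top level of a fresh clone.
purge_path() {
    target="$1"
    if [ -z "$target" ]; then
        echo "usage: purge_path <path-inside-repo>" >&2
        return 2
    fi
    if ! command -v git-filter-repo >/dev/null 2>&1; then
        echo "git-filter-repo not found on PATH" >&2
        return 127
    fi
    # --path selects the file; --invert-paths keeps everything EXCEPT it,
    # across all branches and tags.
    git filter-repo --invert-paths --path "$target"
}

# Usage:
#   purge_path tools/analyze_weights_featuremap/outputs.npy
```

After it finishes, the force pushes of branches and tags from the steps above still apply.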

 

This article was last updated on December 9, 2020.