486 Commits (235fed326910a90b3322bc535d643a9003f8d7e1)
 

Author SHA1 Message Date
Elijah Newren 235fed3269 Makefile: avoid leaking GIT_INDEX_FILE shenanigans to additional commands
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 0cd8a1fd39 filter-repo: fix blob count when analyzing
Reported-by: Li Linchao
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 240ef0bcc2 README.md: clarify simple instruction rules and antecedent of `it`
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 933475ecf1 Make it clearer that --path* do not follow renames
The wording "exact paths" does not appear to be clear enough for folks,
and I keep getting bug reports about filter-repo not following renames.
Make it very explicit.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 05e3548b67 Merge branch 'rnd/add-report-dir-option'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
rndbit e9d5ab3529 filter-repo: add option --report-dir to set custom analysis dir
--analyze is hardcoded to write to a subdirectory inside GIT_DIR.

When practicing filtering runs on a large repo, it is desirable to keep
an unchanged read-only copy to reduce the chance of user error, and to
be able to analyze that read-only repo without having to clone it. This
would save a lot of time and space.

Add a --report-dir option to set a non-default destination directory for
the analysis output.

Signed-off-by: rndbit <rndbit@filter.bitman.net>
[en: fixed existing regression test broken by now not overwriting the
     analysis directory unconditionally, and also added a new test of
     the new behavior for code coverage.]
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren a8ed6929d0 Merge branch 'rnd/fix-binary-blob-detection'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
rndbit 993216739e filter-repo: add tests for --replace-text in binary blobs
--replace-text failed to detect binary blobs and was incorrectly
applied to all blobs.
Prior to the switch from python2 to python3, it incorrectly designated
blobs containing the 0 character, rather than the NUL byte, as binary.
That would have caused text replacements to apply to binary files and
not to text files containing the 0 character.

Add regression tests with blobs containing: the 0 character, a NUL byte,
and both the 0 character and a NUL byte.

Signed-off-by: rndbit <rndbit@filter.bitman.net>
3 years ago
rndbit 9cfe2b4090 filter-repo: fix detection of binary blobs for --replace-text
Detection of whether a blob is binary for the purpose of --replace-text
always fails, so text replacement is applied to all blobs. This changed
with the move to python3. With python2 the same code would still be
wrong but would manifest differently.

In the construct 'for x in b"..."' the x is
 - of type <int> in python3
 - of type <str> in python2
thus in python3 the condition 'x == b"\0"' cannot be true for any x due
to the type difference.

Further, the search was supposed to look for the NUL byte and not the 0
character, so change b"0" to b"\0".

Signed-off-by: rndbit <rndbit@filter.bitman.net>
3 years ago
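
The type difference described in the commit above can be reproduced in a few lines of Python 3. This is a minimal sketch, not the code from git-filter-repo; `looks_binary_buggy` and `looks_binary` are hypothetical helper names.

```python
def looks_binary_buggy(data: bytes) -> bool:
    # Iterating over bytes in Python 3 yields ints, so comparing each
    # element to a bytes object is never true.
    return any(x == b"\0" for x in data)


def looks_binary(data: bytes) -> bool:
    # A containment test on the bytes object itself finds the NUL byte.
    return b"\0" in data


sample = b"abc\0def"
print(looks_binary_buggy(sample))  # False
print(looks_binary(sample))        # True
```

A blob containing only the character `0` (for example `b"a0b"`) is not reported as binary by `looks_binary`, which is the distinction the commit message draws between the 0 character and the NUL byte.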
Elijah Newren d8e858aeca Merge branch 'sr/fix-file-used-in-version-calculation'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Gwyneth Morgan 129a3bcb8b filter-repo: add new --replace-message option
Like --replace-text, add an option --replace-message which replaces text
in commit/tag message bodies, so that users can easily replace text
without constructing a --message-callback.

Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Stefano Rivera e7728c38ae Calculate the version from the module, not the entry_point
When git-filter-repo is installed, sys.argv[0] will be an entry-point
stub, not the relevant Python module.

Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Elijah Newren c5af37f82c Makefile: avoid releasing with uncommitted changes
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren fd64b0c7c0 INSTALL.md: more clarifications and links
Someone was surprised by my claim that someone else had reported
Microsoft provided a stub or stripped down python.  Link to where it was
reported in case others hit the same problem.

Vilius Šumskas reported that the need to edit the shebang line has been
corrected with the newest Git for Windows, so update the text to note
this.  It's possible other users may still have problems given the
variety of Windows versions and the number of reports I had about this,
so I want to still leave links there for at least a little while.

Be more explicit about how pip is lame and provides virtually no benefit
since it leaves you to fix your $PATH yourself, which was the only step
that was needed in installing the whole package anyway.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren a557077438 Merge branch 'bm/setup-py-entry-points'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Benjamin Motz 4ff15cd422 Use setup.py entry_points for installation
This should make the installation via pip more robust.

On Windows the usage of entry_points will install a wrapper executable
for the script that chooses the proper python executable. This
essentially makes the script run correctly when called via `git
filter-repo` (direct execution via `git-filter-repo` was already fine
before).

This fixes an issue on Windows, where the git-installation will choose a
different python executable than the one indicated by the installation
via `pip{x,3} install`.

Signed-off-by: Benjamin Motz <benjamin.motz@mailbox.org>
3 years ago
Elijah Newren 7ceb213f04 filter-repo: ensure we close files so they get written
It appears that python will usually write out files even if we do not
explicitly close them, but other tweaks to the code can make this not
happen.  Explicitly close the files to be safe.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
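
One common way to guarantee that a file is closed, and therefore flushed to disk, is a `with` block. The sketch below illustrates that general Python idiom and is not taken from the commit; `write_report` and its arguments are hypothetical.

```python
def write_report(path, lines):
    # The file is closed when the `with` block exits, even if an
    # exception is raised, so buffered data is written out.
    with open(path, "wb") as f:
        for line in lines:
            f.write(line + b"\n")
```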
Elijah Newren 1c4551021f Merge branch 'cm/fix-documentation-typo'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Cody Martin 8abc4770e7 git-filter-repo.txt: fix typo in paths-from-file example
The "Filtering based on many paths" section includes this code snippet,
```
regex:^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}.txt$
```
and this text
```
files whose name
was of the form YYYY.MM-DD.txt at least two subdirectories deep
```
Update the text to YYYY-MM-DD.txt to correctly match the regex
in the code snippet.

Signed-off-by: Cody Martin <codytylermartin@gmail.com>
3 years ago
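
The mismatch this commit fixes can be checked by running the regex from the snippet against sample paths. This sketch assumes the pattern is applied with Python's `re` module to the whole path as a string, which is a simplification; the helper name `matches` and the sample paths are made up for illustration.

```python
import re

# Pattern copied from the commit message above.
PATTERN = re.compile(r"^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}.txt$")


def matches(path: str) -> bool:
    return PATTERN.search(path) is not None


print(matches("docs/notes/2020-01-31.txt"))  # True: YYYY-MM-DD.txt, two directories deep
print(matches("notes/2020-01-31.txt"))       # False: only one directory deep
print(matches("docs/notes/2020.01-31.txt"))  # False: YYYY.MM-DD.txt does not match
```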
Elijah Newren 65ce5002fe Merge branch 'sl/mailmap-email-case'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Stefan Lietzau c9a9dcc886 filter-repo: ignore case for email address with mailmap
`git shortlog` ignores the case when matching the email address. As
such, `git filter-repo` should do the same.

Signed-off-by: Stefan Lietzau <lietzaustefan@gmail.com>
[en: fixed a small logic error, tweaked the commit message, and rebased]
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
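
Case-insensitive matching of email addresses can be sketched by lowercasing the address on both insertion and lookup. This is an illustration under assumed data shapes, not the git-filter-repo implementation; `build_lookup`, `translate`, and the tuple layout are hypothetical.

```python
def build_lookup(entries):
    # entries: iterable of (old_email, new_name, new_email) bytestrings.
    # Keys are lowercased so that lookups can ignore case.
    return {old.lower(): (name, email) for old, name, email in entries}


def translate(lookup, name, email):
    # Fall back to the original identity when there is no mapping.
    return lookup.get(email.lower(), (name, email))
```

With a mapping for `b"Old@Example.com"`, a commit authored as `b"OLD@example.COM"` is translated, while unrelated addresses pass through unchanged.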
Elijah Newren 7b09784d7b INSTALL.md: reference dscho's excellent python on git-for-windows fixes
Dscho made fixes to msys2, cygwin, git-for-windows, and contributed
several improvements to git-filter-repo that were merged in
js/windows-fixes.  Reference some of the fixes so that those who had
issues with git-filter-repo in the past may be willing to retry, and
update the installation instructions with relevant pointers.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 47c5a29fd4 Merge branch 'sb/callback-from-file'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Shezan Baig 5256c99e49 Allow callback body to be loaded from a file
For anything more complicated than a few lines, it's easier to write the
callback body in a file and let filter-repo load the file as a string.

Signed-off-by: Shezan Baig <sbaig1@bloomberg.net>
[en: added a testcase for code coverage]
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
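
The idea of accepting a callback body either inline or from a file can be sketched as follows. This is a hedged illustration only: the rule "treat the argument as a filename if such a file exists" and the helper names `load_callback_body` and `make_callback` are assumptions, not confirmed details of git-filter-repo.

```python
import os


def load_callback_body(arg: str) -> str:
    # Assumption: if the argument names an existing file, use its contents;
    # otherwise treat the argument itself as the callback body.
    if os.path.exists(arg):
        with open(arg) as f:
            return f.read()
    return arg


def make_callback(body: str, argname: str):
    # Wrap the body in a function definition and compile it.
    indented = "".join("  " + line + "\n" for line in body.splitlines())
    source = "def callback({}):\n{}".format(argname, indented)
    namespace = {}
    exec(source, namespace)
    return namespace["callback"]
```

For example, `make_callback("return name.upper()", "name")` returns a function that uppercases its argument.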
Elijah Newren a10fa46010 Merge branch 'sr/reusable-test-runner-script'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Stefano Rivera 24f09bd016 Share implementation with github workflow
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Stefano Rivera 26e3f8c52e Exit non-zero if the tests fail
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Stefano Rivera 34b26f4026 Break the actual test runner into its own script
So that we don't have to run with coverage if we don't want to.

Additionally, don't require being in the t directory to run tests.

Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
3 years ago
Elijah Newren e5d8938d48 lint-history: explain how TMPDIR can be used
Some users may want to take advantage of setting TMPDIR to another
location that might be faster for the linting process.

Reported-by: @ruv on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren ccc37d3423 lint-history: explain filename paths
It was not clear for some users that the filenames would be relative
paths from the toplevel of the repository.  Add some text to explain
this.

Reported-by: @ruv on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren dc012d277b bfg-ish: add some sanity checks on the specified repo
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 06fa059744 Merge branch 'bl/bfg-ish-relative-paths'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
林博仁(Buo-ren Lin) e732141363 Fix relative path compatibility for --replace-text and bfg_args.repo
Users could specify relative paths on the command line, and then also
provide a directory other than '.' for the repo.  Since we did an
unconditional os.chdir() to move into the repo, that would invalidate
the original relative paths.  Fix that by changing the relative paths
into absolute paths.

Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
[en: tweaked commit message to explain the problem]
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
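
The fix described above amounts to resolving relative paths against the original working directory before changing into the repo. A minimal sketch, assuming a hypothetical helper `resolve_before_chdir` (not a function from bfg-ish):

```python
import os


def resolve_before_chdir(paths, repo_dir):
    # Resolve each path while the original working directory is still
    # current, so later os.chdir() calls cannot invalidate it.
    absolute = [os.path.abspath(p) for p in paths]
    os.chdir(repo_dir)
    return absolute
```

After the call, the returned paths still point at the files the user named on the command line, even though the process is now running inside `repo_dir`.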
Elijah Newren 75e67bcd44 git-filter-repo.txt: link to GitHub docs on purging old history
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 12743def48 git-filter-repo.txt: add some clarifications around replace refs
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 8683d6fe48 Merge branch 'js/windows-fixes'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Johannes Schindelin fbaab1704c lint-history: do decode bytes
This fixes the "TypeError: a bytes-like object is required, not 'str'"
problem on Windows, letting t9391 pass.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
3 years ago
Johannes Schindelin e0a3df8c62 Fix the Python path on Windows
On Windows, we want to run with a native Python, i.e. the separator is a
semicolon, and the paths should be Windows paths (although they're
allowed to have forward slashes instead of backslashes).

Since we're most likely running this in an MSYS2 Bash, allow for
`$TEST_DIRECTORY` to pretend to be a Unix path, and translate it via
`cygpath` into a Windows path.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
3 years ago
Elijah Newren 3f181531df README.md: link to external formatting of user manual
Some people don't like htmlpreview.github.io.  I once or twice saw a
case where it appeared to be affected by load limits.  Since external
sites are making the manual available, and it's unlikely there are too
many changes between the last release and the current manual, just link
to it as an alternative for folks.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren d2fdc89ff3 filter-repo: avoid depending on `wc` binary being present
rev-list already has --count option anyway, so piping output to wc -l to
count the number of lines was a total waste of time.  Plus, it might
cause failures for the testsuite on some Windows boxes.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren cf67ccd978 filter-repo: improve invalid repository error message
Even though the repository is encoded as a bytestring, we want error
messages to be UTF-8.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 7500fb7c5a t9390: add a testcase for --path-rename with no colon
Commit 28b479b7 (Fix bug in --path-rename argument without colon,
2021-03-12) added a new conditional error message, with no corresponding
testcase to ensure the line was covered.  I forgot to check the coverage
before merging the change.  Add a relevant test now.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren 97a1613f81 lint-history: fix binary blob detection
We had a lingering issue in the conversion from python2 to python3; as
reported by @thebrandre on GitHub:

    any(x==b'1' for x in b"123")
    # returns True in Python2 and False in Python3 because different
    # types are returned on iteration:
    [type(x) for x in b"123"]
    # Python2: [<type 'str'>, <type 'str'>, <type 'str'>]
    # Python3: [<class 'int'>, <class 'int'>, <class 'int'>]

Replace the
    any(x==b"0" for x in blob.data[0:8192])
construct with
    b"\0" in blob.data[0:8192]
to fix this.

Suggested-by: @thebrandre on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Elijah Newren cf84943982 Merge branch 'lk/path-rename-colon-count'
Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
Lassi Kortela 28b479b79d Fix bug in --path-rename argument without colon
The --path-rename flag expected an argument with a colon
character (':') in it, which it assumed without checking. If the user
gave an argument with no colon in it, this backtrace would be shown:

  File "/usr/local/bin/git-filter-repo", line 1626, in __call__
    if values[0] and values[1] and not (
IndexError: list index out of range

Add a real error message in place of the backtrace.

Also check that there's exactly one colon; show an error message if
there's more than one, as that syntax has no interpretation that is
obviously the right one.

Signed-off-by: Lassi Kortela <lassi@lassi.io>
3 years ago
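
The validation described in this commit can be sketched as a small parser that insists on exactly one colon. The function name `parse_rename` and the wording of the error message are hypothetical; only the "exactly one colon" rule comes from the commit message.

```python
def parse_rename(value: bytes):
    # Reject arguments with zero colons (previously an IndexError) and
    # with more than one colon (ambiguous).
    if value.count(b":") != 1:
        raise ValueError(
            "--path-rename expects exactly one colon, in the form OLD:NEW")
    old, new = value.split(b":")
    return old, new
```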
Elijah Newren 4987e0f6e3 filter-repo: fix --use-mailmap
--use-mailmap was defined as `--mailmap .mailmap` except that it would
set args.mailmap to ".mailmap" rather than b".mailmap" (in other words,
it accidentally set it to a string rather than a bytestring).  Since
the --mailmap parameter is always passed as a bytestring, we ran into
errors with calling unknown functions due to the type mismatch.

Signed-off-by: Elijah Newren <newren@gmail.com>
3 years ago
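
The string-versus-bytestring mismatch is easy to reproduce with `argparse`, because `type` conversion is applied to values from the command line but not to a `const`. The sketch below is an assumption-laden illustration, not the parser from git-filter-repo: `build_parser` is hypothetical, and using `os.fsencode` as the `type` is an assumed way of producing bytestrings.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Values given on the command line pass through `type`, so they
    # arrive as bytes.
    parser.add_argument("--mailmap", dest="mailmap", type=os.fsencode)
    # `const` is stored as-is (argparse does not apply `type` to it), so
    # it must already be a bytestring to match the --mailmap case.
    parser.add_argument("--use-mailmap", dest="mailmap",
                        action="store_const", const=b".mailmap")
    return parser
```

With this setup both `--mailmap .mailmap` and `--use-mailmap` leave `args.mailmap` as `b".mailmap"`; writing `const=".mailmap"` instead would reintroduce the type mismatch the commit describes.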
Elijah Newren 407d15dd29 Merge pull request #167 from dscho/meaow
Add a GitHub workflow for continuous testing

Signed-off-by: Elijah Newren <newren@gmail.com>
4 years ago
Johannes Schindelin d28b2a7346 Add a GitHub workflow to test this thing
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin d0dcece202 t9391: guard `dos2unix` use behind a prereq
Not all setups have `dos2unix`. Most notably, the Ubuntu and macOS
agents of GitHub Actions don't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago
Johannes Schindelin 85afdf9da9 t9391: don't rely on the system gitconfig defining core.autocrlf=false
The test case t9391.12 specifically wants to test LF vs CR/LF line
ending issues, expecting `core.autoCRLF` to default to `false`. This is
true on Linux and macOS and pretty much everywhere else, except on
Windows.

Let's make sure that the test operates with the `core.autoCRLF` value it
assumes to operate under.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
4 years ago