kibble-scanners.git
6 months ago: main has recently become the default branch for git, so add that into the mix [master]
Daniel Gruno [Sun, 20 Mar 2022 19:14:09 +0000 (20:14 +0100)] 
main has recently become the default branch for git, so add that into the mix

15 months ago: Account for various connection hiccups that can happen
Daniel Gruno [Tue, 1 Jun 2021 12:24:01 +0000 (14:24 +0200)] 
Account for various connection hiccups that can happen

18 months ago: Merge pull request #5 from erogol/patch-1
Tomek Urbaszek [Thu, 25 Mar 2021 13:07:24 +0000 (14:07 +0100)] 
Merge pull request #5 from erogol/patch-1

Update requirements.txt

18 months ago: Update requirements.txt [5/head]
Eren Gölge [Wed, 24 Mar 2021 10:43:23 +0000 (11:43 +0100)] 
Update requirements.txt

Missing `pyyaml`

2 years ago: Remove try/except here, so we can catch raw python/ES errors
Daniel Gruno [Sat, 2 May 2020 13:44:02 +0000 (08:44 -0500)] 
Remove try/except here, so we can catch raw python/ES errors

2 years ago: Update supported plugins
Daniel Gruno [Fri, 1 May 2020 22:13:10 +0000 (17:13 -0500)] 
Update supported plugins

2 years ago: We don't use bttf any longer, nix it
Daniel Gruno [Fri, 1 May 2020 22:08:49 +0000 (17:08 -0500)] 
We don't use bttf any longer, nix it

2 years ago: Obvious fix: these parameters are never used, and cause forward compatibility issues...
Daniel Gruno [Fri, 1 May 2020 22:07:55 +0000 (17:07 -0500)] 
Obvious fix: these parameters are never used, and cause forward compatibility issues. So, nix them.

Thanks to ncg81 for pointing this out.

2 years ago: don't count folders as building (even if sub-projects are)
Daniel Gruno [Fri, 27 Mar 2020 17:02:17 +0000 (12:02 -0500)] 
don't count folders as building (even if sub-projects are)

2 years ago: rework Jenkins scanners to work with org folders and multibranch jobs
Daniel Gruno [Fri, 27 Mar 2020 04:42:29 +0000 (23:42 -0500)] 
rework Jenkins scanners to work with org folders and multibranch jobs

2 years ago: duration is in ms as well, so divide all of it by 1000
Daniel Gruno [Sun, 15 Mar 2020 16:38:15 +0000 (11:38 -0500)] 
duration is in ms as well, so divide all of it by 1000

3 years ago: fix old typo, pass on KibbleBit to allow for custom branch detection
Daniel Gruno [Wed, 17 Jul 2019 10:59:35 +0000 (12:59 +0200)] 
fix old typo, pass on KibbleBit to allow for custom branch detection

3 years ago: refactor branch detection, allow for custom config
Daniel Gruno [Wed, 17 Jul 2019 10:59:10 +0000 (12:59 +0200)] 
refactor branch detection, allow for custom config

- Refactor branch detection to iterate over a set of wanted branches
- Allow said branches to be defined in the main yaml.
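
The two points above could look roughly like the sketch below. The function name `pick_branch` and the fallback list are assumptions made for illustration; the commit does not show the scanner's actual code or the name of the yaml key.

```python
# Hypothetical helper: names and defaults are illustrative, not taken from the scanner.
DEFAULT_WANTED = ["master", "trunk"]


def pick_branch(available, wanted=None):
    """Return the first wanted branch that exists in `available`, or None."""
    # `wanted` stands in for a list read from the main yaml config.
    for name in (wanted or DEFAULT_WANTED):
        if name in available:
            return name
    return None
```

Because the wanted branches are checked in order, the order of the configured list doubles as a priority list.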

3 years ago: work around rate limits by sleeping it off
Daniel Gruno [Wed, 17 Jul 2019 03:54:38 +0000 (05:54 +0200)] 
work around rate limits by sleeping it off

If we get hit by rate limits, first try to sleep it off in 5 minute
intervals till we hopefully get refreshed limits, otherwise bail
after a max of 2 hours of waiting.
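
A minimal sketch of that wait loop, assuming a caller-supplied `is_limited` check (hypothetical; the commit does not say how the limit is detected). The 5 minute interval and 2 hour cap come from the message above.

```python
import time


def wait_for_rate_limit(is_limited, interval=300, max_wait=7200, sleep=time.sleep):
    """Sleep in `interval`-second steps while `is_limited()` is true.

    Returns True once the limit has cleared, or False after `max_wait`
    seconds of waiting. `sleep` is injectable so the loop can be tested.
    """
    waited = 0
    while is_limited():
        if waited >= max_wait:
            return False  # bail: limits never refreshed
        sleep(interval)
        waited += interval
    return True
```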

3 years ago: forgot to pass along params!
Daniel Gruno [Wed, 17 Jul 2019 03:53:28 +0000 (05:53 +0200)] 
forgot to pass along params!

3 years ago: tweak GitHub to work around abuse limitations
Daniel Gruno [Wed, 17 Jul 2019 03:50:30 +0000 (05:50 +0200)] 
tweak GitHub to work around abuse limitations

If we hit the abuse detection system, try to sleep it off a few times
before finally bailing.

3 years ago: Add timeouts for JSON requests
Daniel Gruno [Sun, 14 Jul 2019 04:59:44 +0000 (06:59 +0200)] 
Add timeouts for JSON requests

- Add a default, hardcoded connection timeout (this apparently can prevent hangups on some debian/ubuntu systems)
- Add a default timeout of 30 seconds for the read phase of the request as well.
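
As a rough illustration, assuming the requests are made with the `requests` library (which accepts a `(connect, read)` tuple for `timeout`), default timeouts could be merged in as below. The connect value of 5 seconds is a placeholder, since the commit only says it is hardcoded; `with_default_timeouts` is a hypothetical helper name.

```python
# Assumption: the commit does not state the connect timeout; 5 s is a placeholder.
CONNECT_TIMEOUT = 5
READ_TIMEOUT = 30  # read-phase timeout stated in the commit message


def with_default_timeouts(kwargs):
    """Return a copy of request kwargs with a (connect, read) timeout unless one is set."""
    merged = dict(kwargs)
    merged.setdefault("timeout", (CONNECT_TIMEOUT, READ_TIMEOUT))
    return merged


# Usage with requests (not imported here):
#   requests.get(url, **with_default_timeouts({"headers": {"Accept": "application/json"}}))
```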

3 years ago: State who we actually are :)
Daniel Gruno [Fri, 14 Jun 2019 12:43:55 +0000 (14:43 +0200)] 
State who we actually are :)

3 years ago: Raise a ConnectionError, so we can catch it
Daniel Gruno [Fri, 14 Jun 2019 12:06:42 +0000 (14:06 +0200)] 
Raise a ConnectionError, so we can catch it

scanners are expecting a ConnectionError exception on failure,
so let's make sure we raise that, and don't break scanning.

3 years ago: Merge branch 'master' of github.com:apache/kibble-scanners
Daniel Gruno [Tue, 11 Jun 2019 19:36:45 +0000 (21:36 +0200)] 
Merge branch 'master' of github.com:apache/kibble-scanners

3 years ago: only fetch as json if not errored out.
Daniel Gruno [Tue, 11 Jun 2019 19:36:19 +0000 (21:36 +0200)] 
only fetch as json if not errored out.

We only expect JSON if we hit an okay response code.
For error responses, we are not interested in the JSON, so raise an exception
instead.
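
A sketch of that rule, assuming a response object with `status_code`, `text` and `json()` as provided by `requests`. The helper name `json_or_raise` is hypothetical, and using Python's built-in `ConnectionError` is an assumption; the commits do not say which exception class the scanners catch.

```python
def json_or_raise(response):
    """Parse JSON only for a 2xx response; raise ConnectionError otherwise."""
    if 200 <= response.status_code < 300:
        return response.json()
    # Error bodies are not parsed as JSON; pass along a short excerpt of the text.
    raise ConnectionError("HTTP %s: %s" % (response.status_code, response.text[:200]))
```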

3 years ago: consistency is not always a thing, ignore it...
Daniel Gruno [Mon, 10 Jun 2019 15:53:20 +0000 (17:53 +0200)] 
consistency is not always a thing, ignore it...

3 years ago: copypasto
Daniel Gruno [Mon, 10 Jun 2019 15:43:17 +0000 (17:43 +0200)] 
copypasto

3 years ago: Adjust to work with ES >= 7.x (prepping for no doc_types at all in 8.x)
Daniel Gruno [Mon, 10 Jun 2019 15:42:41 +0000 (17:42 +0200)] 
Adjust to work with ES >= 7.x (prepping for no doc_types at all in 8.x)

3 years ago: Commit timestamps should have H:M:S as well, not just days
Daniel Gruno [Sat, 27 Apr 2019 04:38:47 +0000 (23:38 -0500)] 
Commit timestamps should have H:M:S as well, not just days

3 years ago: Try our best to respect and compromise on rate limiting, quietly fail otherwise
Daniel Gruno [Thu, 24 Jan 2019 21:17:10 +0000 (22:17 +0100)] 
Try our best to respect and compromise on rate limiting, quietly fail otherwise

3 years ago: fix dependency notes
Daniel Gruno [Wed, 9 Jan 2019 08:26:59 +0000 (09:26 +0100)] 
fix dependency notes

dateutils is for the server, not the scanner

3 years ago: Move to using cloc 1.76 as minimum
Daniel Gruno [Wed, 9 Jan 2019 08:23:09 +0000 (09:23 +0100)] 
Move to using cloc 1.76 as minimum

- Change minimum required version of cloc for counting to 1.76
- Start utilizing multiprocessing for cloc
- Fix Readme to reflect this change, note that it's optional

3 years ago: Add command line options to readme
Daniel Gruno [Wed, 9 Jan 2019 08:13:03 +0000 (09:13 +0100)] 
Add command line options to readme

3 years ago: tickets with these values are broken and need a rescan
Daniel Gruno [Mon, 31 Dec 2018 05:25:37 +0000 (06:25 +0100)] 
tickets with these values are broken and need a rescan

Tickets with unknown@kibble are likely scanned with the
old (broken) scanner, and we should redo them when
encountered in the DB.

3 years ago: Fake user email if not provided by JIRA
Daniel Gruno [Mon, 31 Dec 2018 04:50:32 +0000 (05:50 +0100)]
Fake user email if not provided by JIRA

In some instances, JIRA will have email visibility turned
off for the REST API. In such instances, we should use the
domain of the JIRA instance and the username (which is still
visible to us) to fake an email address. While not perfect,
this still allows us to get a good unique count of actors.
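
A minimal sketch of the fallback described above. The function name `fake_email` and the example URL are assumptions; only the idea of combining the username with the JIRA instance's domain comes from the commit message.

```python
from urllib.parse import urlparse


def fake_email(username, jira_url):
    """Build a stand-in address from the username and the JIRA host name."""
    # Fall back to a fixed label if the URL has no parsable host.
    domain = urlparse(jira_url).hostname or "unknown"
    return "%s@%s" % (username, domain)
```

The result is stable per user and per instance, which is what matters for counting unique actors.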

4 years ago: if we've scanned before, only grab latest changes
Daniel Gruno [Wed, 19 Sep 2018 07:35:59 +0000 (09:35 +0200)] 
if we've scanned before, only grab latest changes

4 years ago: we need to sleep to avoid abusing github
Daniel Gruno [Wed, 19 Sep 2018 07:19:57 +0000 (09:19 +0200)] 
we need to sleep to avoid abusing github

4 years ago: this is extremely spammy when running manually :(
Daniel Gruno [Wed, 19 Sep 2018 07:12:51 +0000 (09:12 +0200)] 
this is extremely spammy when running manually :(

disable it for now

4 years ago: we should just pass the response text..
Daniel Gruno [Wed, 19 Sep 2018 07:09:11 +0000 (09:09 +0200)] 
we should just pass the response text..

4 years ago: could be no auth was set..
Daniel Gruno [Wed, 19 Sep 2018 07:01:59 +0000 (09:01 +0200)] 
could be no auth was set..

4 years ago: we're only interested in label names for now
Daniel Gruno [Wed, 19 Sep 2018 06:58:02 +0000 (08:58 +0200)] 
we're only interested in label names for now

4 years ago: fix error logging
Daniel Gruno [Wed, 19 Sep 2018 06:55:17 +0000 (08:55 +0200)] 
fix error logging

wrong position of parens

4 years ago: wrong var name here
Daniel Gruno [Wed, 19 Sep 2018 06:39:35 +0000 (08:39 +0200)] 
wrong var name here

4 years ago: oops, typo
Daniel Gruno [Wed, 12 Sep 2018 11:40:52 +0000 (13:40 +0200)] 
oops, typo

4 years ago: add support for files changed as a list in each commit object
Daniel Gruno [Wed, 12 Sep 2018 10:16:09 +0000 (12:16 +0200)] 
add support for files changed as a list in each commit object

4 years ago: handle closer key errors properly
Daniel Gruno [Sun, 9 Sep 2018 14:06:20 +0000 (16:06 +0200)] 
handle closer key errors properly

it's a dict...

4 years ago: closer may be null for imported issues
Daniel Gruno [Sun, 9 Sep 2018 14:05:23 +0000 (16:05 +0200)] 
closer may be null for imported issues

4 years ago: bump default limits for tone/mood analysis
Daniel Gruno [Sun, 9 Sep 2018 14:03:50 +0000 (16:03 +0200)] 
bump default limits for tone/mood analysis

4 years ago: fix key errors
Daniel Gruno [Sun, 9 Sep 2018 14:03:18 +0000 (16:03 +0200)] 
fix key errors

sometimes the assignee/reporter is set but no email address is
available, so just try to get it non-fatally.

4 years ago: prefer full name over username, if available
Daniel Gruno [Fri, 2 Mar 2018 18:51:32 +0000 (19:51 +0100)] 
prefer full name over username, if available

- only store shortened bio if it's new
- prefer full name over username if we find it.

4 years ago: turns out this needs to be 0 for latest posts
Daniel Gruno [Fri, 2 Mar 2018 15:46:28 +0000 (16:46 +0100)] 
turns out this needs to be 0 for latest posts

4 years ago: add forum type, and a date-string field for histograms
Daniel Gruno [Fri, 2 Mar 2018 15:33:45 +0000 (16:33 +0100)] 
add forum type, and a date-string field for histograms

4 years ago: don't kill the loop
Daniel Gruno [Fri, 2 Mar 2018 14:51:41 +0000 (15:51 +0100)] 
don't kill the loop

scope is wrong here. it should return True at the end,
if the loop is successful, and False within the loop when
an error occurs.
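
The corrected scoping can be sketched as follows. `store_all` and its `store` callback are hypothetical names; the point is only where the two `return` statements sit relative to the loop.

```python
def store_all(items, store):
    """Return False as soon as one item fails, True only after the whole loop ran."""
    for item in items:
        try:
            store(item)
        except Exception:
            return False  # inside the loop: an error occurred
    return True  # after the loop: every item was handled
```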

4 years ago: We can accept both version 1 and 2 atm
Daniel Gruno [Fri, 2 Mar 2018 14:46:53 +0000 (15:46 +0100)] 
We can accept both version 1 and 2 atm

...but 2 is obviously preferred!

4 years ago: Rewrite broker class, inherit kibble UI wrapper
Daniel Gruno [Fri, 2 Mar 2018 14:44:01 +0000 (15:44 +0100)] 
Rewrite broker class, inherit kibble UI wrapper

Instead of doing checks constantly, we'll inherit the
wrapper class from the UI repo. There are a few cases
where we still have to manually do if/else (bulk and
api checks), but the rest can be aliases.

Bump accepted DB version to 2.

4 years ago: Align with new DB format
Daniel Gruno [Fri, 2 Mar 2018 11:07:45 +0000 (12:07 +0100)]
Align with new DB format

if the DB is typeless, write to it and fetch from it
accordingly.

4 years ago: store the person posting, d'uh.
Daniel Gruno [Mon, 26 Feb 2018 19:49:33 +0000 (20:49 +0100)] 
store the person posting, d'uh.

4 years ago: be sure to actually store the post doc in ES
Daniel Gruno [Mon, 26 Feb 2018 19:47:54 +0000 (20:47 +0100)] 
be sure to actually store the post doc in ES

4 years ago: remove spurious comment
Daniel Gruno [Mon, 26 Feb 2018 19:21:28 +0000 (20:21 +0100)] 
remove spurious comment

4 years ago: Updates to JSON API
Daniel Gruno [Mon, 26 Feb 2018 19:20:16 +0000 (20:20 +0100)] 
Updates to JSON API

- We want JSON, so specify that
- Sometimes we need a token (like for Travis) instead of basic auth

4 years ago: Add initial Discourse scanner plugin
Daniel Gruno [Mon, 26 Feb 2018 19:19:32 +0000 (20:19 +0100)] 
Add initial Discourse scanner plugin

4 years ago: Try to grab the different states of 'started' jobs
Daniel Gruno [Fri, 23 Feb 2018 08:32:56 +0000 (09:32 +0100)] 
Try to grab the different states of 'started' jobs

hopefully I got this right and we'll be able to see the
diff between started and queued/blocked builds.

4 years ago: this should use kibble's pprinter
Daniel Gruno [Wed, 21 Feb 2018 19:26:06 +0000 (20:26 +0100)] 
this should use kibble's pprinter

4 years ago: need to return true here
Daniel Gruno [Wed, 21 Feb 2018 19:24:53 +0000 (20:24 +0100)] 
need to return true here

this isn't a failed position to be in.

4 years ago: break if we hit the end
Daniel Gruno [Wed, 21 Feb 2018 19:23:12 +0000 (20:23 +0100)] 
break if we hit the end

continue is a bad choice, we need a clean break out of the loop

4 years ago: the job URL needs to be consistent, tweak it
Daniel Gruno [Wed, 21 Feb 2018 18:37:45 +0000 (19:37 +0100)] 
the job URL needs to be consistent, tweak it

4 years ago: reverse logic, fix a string
Daniel Gruno [Wed, 21 Feb 2018 18:29:58 +0000 (19:29 +0100)] 
reverse logic, fix a string

4 years ago: scan all previous jobs, if that makes sense
Daniel Gruno [Wed, 21 Feb 2018 18:25:37 +0000 (19:25 +0100)] 
scan all previous jobs, if that makes sense

if we haven't scanned older jobs before, we scan them all.
This has some built-in logic that cancels a full scan
if we're on page 2 or above and find a build that we've seen
before, or if travis tells us to stop.
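
One way to picture that stop logic is the sketch below. It is not the scanner's code: `fetch_page` is a hypothetical callable returning build ids for a page, and an empty page stands in for "travis tells us to stop".

```python
def scan_new_builds(fetch_page, seen):
    """Collect unseen build ids page by page, stopping early once history overlaps."""
    new = []
    page = 1
    while True:
        builds = fetch_page(page)
        if not builds:  # stand-in for the API signalling the end
            break
        hit_known = False
        for build in builds:
            if build in seen:
                hit_known = True
            else:
                new.append(build)
        # Only cancel on page 2 or above; page 1 always overlaps on a rescan.
        if hit_known and page >= 2:
            break
        page += 1
    return new
```

With an empty `seen` set nothing ever overlaps, so every page is walked, which matches the "scan them all" case for a first run.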

4 years ago: queue size should reflect multibuilds
Daniel Gruno [Wed, 21 Feb 2018 15:12:31 +0000 (16:12 +0100)] 
queue size should reflect multibuilds

some jobs have multiple builds, which all increase
the queue size (I'd think?). So let's multiply queue
size by number of concurrent jobs.

4 years ago: Add initial WIP Travis CI Scanner
Daniel Gruno [Wed, 21 Feb 2018 14:31:55 +0000 (15:31 +0100)] 
Add initial WIP Travis CI Scanner

H/T to Pono for the help here.

4 years ago: don't bork if color isn't present
Daniel Gruno [Wed, 21 Feb 2018 09:48:25 +0000 (10:48 +0100)] 
don't bork if color isn't present

4 years ago: allow for excluding scanner types with -e flag
Daniel Gruno [Wed, 21 Feb 2018 09:46:41 +0000 (10:46 +0100)] 
allow for excluding scanner types with -e flag

4 years ago: also count jobs building at the moment
Daniel Gruno [Mon, 19 Feb 2018 11:56:12 +0000 (12:56 +0100)] 
also count jobs building at the moment

4 years ago: Add in preliminary buildbot scanner
Daniel Gruno [Sat, 17 Feb 2018 09:29:55 +0000 (10:29 +0100)] 
Add in preliminary buildbot scanner

This isn't as advanced as the jenkins scanner (queues are
rather opaque in buildbot), but it'll show stuck builds,
as well as builds by duration/count.

4 years ago: we need the jobURL, which is unique to a job
Daniel Gruno [Fri, 16 Feb 2018 18:25:32 +0000 (19:25 +0100)] 
we need the jobURL, which is unique to a job

This is so we can sum up build durations per job,
and sort them per CI (thus, if both Jenkins and Travis, for example, have
a job with the same name, we'll split them)

4 years ago: add a timestamp for when we think the build finished.
Daniel Gruno [Fri, 16 Feb 2018 18:18:17 +0000 (19:18 +0100)] 
add a timestamp for when we think the build finished.

4 years ago: remove debug return
Daniel Gruno [Fri, 16 Feb 2018 18:16:10 +0000 (19:16 +0100)] 
remove debug return

we want this to actually scan :)

4 years ago: fix queue avg calc, add date key
Daniel Gruno [Fri, 16 Feb 2018 18:06:01 +0000 (19:06 +0100)] 
fix queue avg calc, add date key

avg needs to use max, not min.
add date, so we can do aggregations on dates

4 years ago: Add a generic Jenkins scanner
Daniel Gruno [Fri, 16 Feb 2018 12:13:35 +0000 (13:13 +0100)] 
Add a generic Jenkins scanner

This archives jobs done, queue size, avg wait time,
builds stuck/blocked and so on.

4 years ago: we can't work on a jira ticket without fields data
Daniel Gruno [Mon, 15 Jan 2018 16:19:17 +0000 (17:19 +0100)] 
we can't work on a jira ticket without fields data

4 years ago: If not key phrases, put _NULL_ to avoid breaking ES
Daniel Gruno [Tue, 9 Jan 2018 01:37:27 +0000 (02:37 +0100)] 
If not key phrases, put _NULL_ to avoid breaking ES

ES does not seem to like empty sets here, so we'll
put _NULL_ in there when no phrases were found,
and ignore that in the UI.
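
A small sketch of the placeholder round trip, under the assumption that phrases are stored as a list. Both function names are hypothetical; only the `_NULL_` marker comes from the commit.

```python
NULL_MARKER = "_NULL_"


def phrases_for_index(phrases):
    """Never hand the index an empty list; store the marker instead."""
    return list(phrases) if phrases else [NULL_MARKER]


def phrases_for_display(stored):
    """Drop the marker again before showing phrases to a user."""
    return [phrase for phrase in stored if phrase != NULL_MARKER]
```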

4 years ago: better quote removal
Daniel Gruno [Tue, 9 Jan 2018 00:55:07 +0000 (01:55 +0100)] 
better quote removal

4 years ago: Better trimming of unnecessary text elements
Daniel Gruno [Tue, 9 Jan 2018 00:48:41 +0000 (01:48 +0100)] 
Better trimming of unnecessary text elements

We don't want to be analysing:
- quotes
- "on $date, bla bla wrote" sort of sentences
- URLs, email addresses
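
The kinds of text listed above could be stripped with a few regular expressions, as in this sketch. The patterns are illustrative guesses, not the ones used in the scanner, and will miss some quoting styles.

```python
import re

# Illustrative patterns only; real mail bodies vary a lot.
QUOTE_LINE = re.compile(r"^\s*>.*$", re.MULTILINE)
ATTRIBUTION = re.compile(r"^On .{1,200}wrote:\s*$", re.MULTILINE | re.IGNORECASE)
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def trim_body(text):
    """Remove quoted lines, 'On ... wrote:' lines, URLs and email addresses."""
    for pattern in (QUOTE_LINE, ATTRIBUTION, URL, EMAIL):
        text = pattern.sub("", text)
    # Collapse the blank lines left behind by the removals.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```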

4 years ago: forgot to add kpe to init.py
Daniel Gruno [Tue, 9 Jan 2018 00:32:31 +0000 (01:32 +0100)] 
forgot to add kpe to init.py

4 years ago: Initial stab at KPE for Kibble
Daniel Gruno [Tue, 9 Jan 2018 00:29:09 +0000 (01:29 +0100)] 
Initial stab at KPE for Kibble

This only supports pony mail so far.
We'll have to work on support for Pipermail etc

4 years ago: additional emotional weighting available
Daniel Gruno [Fri, 29 Dec 2017 11:02:44 +0000 (12:02 +0100)] 
additional emotional weighting available

4 years ago: there's a value for this too.
Daniel Gruno [Fri, 8 Dec 2017 18:39:14 +0000 (19:39 +0100)] 
there's a value for this too.

4 years ago: picoAPI has scores for positivity/negativity, let's use those
Daniel Gruno [Fri, 8 Dec 2017 12:48:37 +0000 (13:48 +0100)] 
picoAPI has scores for positivity/negativity, let's use those

4 years ago: weave picoAPI into pm-tone
Daniel Gruno [Fri, 8 Dec 2017 12:12:02 +0000 (13:12 +0100)] 
weave picoAPI into pm-tone

4 years ago: add support for picoAPI sentiment analysis
Daniel Gruno [Fri, 8 Dec 2017 12:11:05 +0000 (13:11 +0100)] 
add support for picoAPI sentiment analysis

The more the merrier

4 years ago: bump the limit from 1 email to 100 per scan at max
Daniel Gruno [Thu, 7 Dec 2017 10:36:10 +0000 (11:36 +0100)] 
bump the limit from 1 email to 100 per scan at max

4 years ago: get ponymail-tone scanner to work with lib changes
Daniel Gruno [Thu, 7 Dec 2017 10:35:50 +0000 (11:35 +0100)] 
get ponymail-tone scanner to work with lib changes

grab all bodies, array them up, then scan them all at once

4 years ago: rework tone lib to accept an array of bodies
Daniel Gruno [Thu, 7 Dec 2017 10:35:24 +0000 (11:35 +0100)] 
rework tone lib to accept an array of bodies

Azure accepts up to 1000 bodies at the same time to
speed up and prevent rate limits, so let's make use of that.
also rework the watson to accept an array, even though
it's still one call per body.
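
Batching against that 1000-body limit can be sketched with a plain chunking helper. `chunked` is a hypothetical name, and how each batch is actually sent to Azure is left out.

```python
def chunked(bodies, size=1000):
    """Yield consecutive slices of at most `size` bodies."""
    for start in range(0, len(bodies), size):
        yield bodies[start:start + size]
```

Each yielded slice would then become one API request, while a per-body service such as Watson would simply loop over the same array one item at a time.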

4 years ago: account for azure rate limiting
Daniel Gruno [Thu, 7 Dec 2017 10:13:59 +0000 (11:13 +0100)] 
account for azure rate limiting

4 years ago: ensure that azure returns a valid response
Daniel Gruno [Thu, 7 Dec 2017 09:58:15 +0000 (10:58 +0100)] 
ensure that azure returns a valid response

4 years ago: this needs to be a string representation
Daniel Gruno [Thu, 7 Dec 2017 09:56:05 +0000 (10:56 +0100)] 
this needs to be a string representation

4 years ago: add commented out example watson/azure creds
Daniel Gruno [Wed, 6 Dec 2017 22:52:01 +0000 (23:52 +0100)] 
add commented out example watson/azure creds

4 years ago: also add azure text analysis option
Daniel Gruno [Wed, 6 Dec 2017 22:49:02 +0000 (23:49 +0100)] 
also add azure text analysis option

rename watson's to watsonTone.

4 years ago: catch exception and store in db if we fail to scan
Daniel Gruno [Wed, 6 Dec 2017 11:13:25 +0000 (12:13 +0100)] 
catch exception and store in db if we fail to scan

4 years ago: report when we're done scanning
Daniel Gruno [Wed, 6 Dec 2017 11:12:02 +0000 (12:12 +0100)] 
report when we're done scanning

4 years ago: begin work on a PoC twitter scanner
Daniel Gruno [Wed, 6 Dec 2017 11:11:02 +0000 (12:11 +0100)]
begin work on a PoC twitter scanner

might be replaced with a streams container, but for now we'll
have something to work with, data-wise.

4 years ago: override alerts should cause a rewind attempt as well
Daniel Gruno [Mon, 27 Nov 2017 15:05:40 +0000 (16:05 +0100)] 
override alerts should cause a rewind attempt as well

When git says 'Your local changes to the following files would be
overwritten by checkout' we should probably try a rewind as well.

4 years ago: fail gracefully if watson breaks
Daniel Gruno [Tue, 24 Oct 2017 20:42:04 +0000 (22:42 +0200)] 
fail gracefully if watson breaks

let's do some better debug of this later on.

4 years ago: need to import the exceptions module
Daniel Gruno [Tue, 24 Oct 2017 17:55:14 +0000 (19:55 +0200)] 
need to import the exceptions module