Changes

Summary

  1. Support specifying a different hostname for the database (commit: 09cad9f2c75b1d5882153509e0315f9df97f9c04) (details)
  2. Cleanup unused databases from the `get_database` module (commit: 2adae3bfcb52f9afef01417343ccbae09258871c) (details)
  3. Remove all references to public data from the codebase (commit: 333f0a3a750bef30dfe7252c1f4a0b8b2070b839) (details)
  4. Support authenticating with mongodb (commit: f69e7e920ac33dac4f1b3413cea9e5d0200a0a11) (details)
Commit 09cad9f2c75b1d5882153509e0315f9df97f9c04 by shankari
Support specifying a different hostname for the database
This wasn't so bad after all. There are still some other places which
have hardcoded `'localhost'`, primarily in the tests, but nothing that
we are using actively.
It is fine for the tests to have `localhost` hardcoded for now since
they are running locally. We can slowly move them to using collections
from `get_database` as we rewrite the mode inference code.
```
(emission) shankari$ find emission/ -name \*.py | xargs grep localhost
emission//analysis/classification/inference/mode.py:      backupSections = MongoClient('localhost').Backup_database.Stage_Sections
emission//core/get_database.py:    # #current_db = MongoClient('localhost').Stage_database
emission//core/get_database.py:    # #current_db = MongoClient('localhost').Stage_database
emission//core/get_database.py:    # current_db=MongoClient('localhost').Stage_database
emission//incomplete_tests/TestAlternativeTripPipeline.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestCarbon.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestDatabaseUtils.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestDatabaseUtils.py:    purge_database_json.purgeData('localhost', self.testUser)
emission//incomplete_tests/TestDatabaseUtils.py:    purge_database_json.purgeData('localhost', self.testUser)
emission//incomplete_tests/TestDatabaseUtils.py:    dump_database_json.dumpData('localhost', '/tmp/testDumpFile')
emission//incomplete_tests/TestMovesCollect.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestProfile.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestRecommendationPipeline.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestUtilityModelPipeline.py:    self.serverName = 'localhost'
emission//net/api/bottle.py:    host, port = (args.bind or 'localhost'), 8080
emission//net/api/cfc_webapp.py:    # Non SSL option for testing on localhost
emission//net/api/wsgiserver2.py:        or IPv6 address, or any valid hostname. The string 'localhost' is a
emission//net/api/wsgiserver2.py:                    # localhost won't work if we've bound to a public IP,
emission//tests/analysisTests/modeinferTests/TestPipeline.py:    self.serverName = 'localhost'
emission//tests/analysisTests/modeinferTests/TestPipeline.py:    client = MongoClient('localhost')
emission//tests/common.py:    test_environ = {'HTTP_REFERER': 'http://localhost:8080/',
emission//tests/coreTests/wrapperTests/TestClient.py:    self.serverName = 'localhost'
(emission) C02KT61MFFT0:e-mission-server shankari$
```
(commit: 09cad9f2c75b1d5882153509e0315f9df97f9c04)
The file was added conf/storage/db.conf.sample
The file was modified emission/core/get_database.py (diff)
The file was modified conf/net/api/webserver.conf.sample (diff)
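The idea behind this commit, reading the database host from a config file instead of hardcoding `'localhost'`, can be sketched roughly as follows. This is only an illustration: the JSON layout, the `timeseries` and `url` key names, and the `load_db_host` helper are assumptions, not the actual contents of `conf/storage/db.conf.sample` or `get_database.py`.

```python
import json
import os


def load_db_host(conf_path, default_host="localhost"):
    """Return the database host named in a JSON config file.

    Falls back to default_host when the file does not exist or does not
    name a host. The "timeseries" and "url" keys are hypothetical and
    used for illustration only.
    """
    if not os.path.exists(conf_path):
        return default_host
    with open(conf_path) as fp:
        conf = json.load(fp)
    return conf.get("timeseries", {}).get("url", default_host)
```

A caller could then build the client from the returned value, for example `MongoClient(load_db_host("conf/storage/db.conf")).Stage_database`, so that the hostname lives in one place.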
Commit 2adae3bfcb52f9afef01417343ccbae09258871c by shankari
Cleanup unused databases from the `get_database` module
Remove obsolete and unused collection access methods from the
`get_database` module. Collections that are obsolete but still in use,
such as `get_section_db()` and `get_trip_db()`, are retained until the
obsolete code that uses them is ported over.
Finally, we also remove `get_db()`, which is superseded by
`_get_current_db()`, along with all of its call sites. This removes one
more hardcoded instance of `localhost` from the code.
With these changes, there are no more references to `get_db`:
```
(emission) $ find . -name \*.py | xargs grep get_db
(emission) $
```
And only one `localhost` is still left outside the tests:
```
(emission) $ find emission/ -name \*.py | xargs grep localhost
emission//analysis/classification/inference/mode.py:      backupSections = MongoClient('localhost').Backup_database.Stage_Sections
emission//core/get_database.py:    # #current_db = MongoClient('localhost').Stage_database
emission//core/get_database.py:    # current_db=MongoClient('localhost').Stage_database
emission//incomplete_tests/TestAlternativeTripPipeline.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestCarbon.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestDatabaseUtils.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestDatabaseUtils.py:    purge_database_json.purgeData('localhost', self.testUser)
emission//incomplete_tests/TestDatabaseUtils.py:    purge_database_json.purgeData('localhost', self.testUser)
emission//incomplete_tests/TestDatabaseUtils.py:    dump_database_json.dumpData('localhost', '/tmp/testDumpFile')
emission//incomplete_tests/TestProfile.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestRecommendationPipeline.py:    self.serverName = 'localhost'
emission//incomplete_tests/TestUtilityModelPipeline.py:    self.serverName = 'localhost'
emission//net/api/bottle.py:    host, port = (args.bind or 'localhost'), 8080
emission//net/api/cfc_webapp.py:    # Non SSL option for testing on localhost
emission//net/api/wsgiserver2.py:        or IPv6 address, or any valid hostname. The string 'localhost' is a
emission//net/api/wsgiserver2.py:                    # localhost won't work if we've bound to a public IP,
emission//tests/analysisTests/modeinferTests/TestPipeline.py:    self.serverName = 'localhost'
emission//tests/analysisTests/modeinferTests/TestPipeline.py:    client = MongoClient('localhost')
emission//tests/common.py:    test_environ = {'HTTP_REFERER': 'http://localhost:8080/',
emission//tests/coreTests/wrapperTests/TestClient.py:    self.serverName = 'localhost'
```
(commit: 2adae3bfcb52f9afef01417343ccbae09258871c)
The file was modified emission/tests/analysisTests/modeinferTests/TestPipeline.py (diff)
The file was modified emission/analysis/modelling/tour_model/cluster_pipeline.py (diff)
The file was removed emission/incomplete_tests/TestMovesCollect.py
The file was modified emission/tests/netTests/TestBuiltinUserCache.py (diff)
The file was modified emission/core/wrapper/client.py (diff)
The file was modified emission/core/wrapper/user.py (diff)
The file was modified emission/net/api/visualize.py (diff)
The file was modified emission/core/get_database.py (diff)
The file was modified emission/tests/coreTests/wrapperTests/TestClient.py (diff)
The file was removed emission/incomplete_tests/TestHomeDetection.py
The file was removed emission/incomplete_tests/TestWorkDetection.py
The file was modified emission/incomplete_tests/TestCarbon.py (diff)
The file was modified emission/tests/common.py (diff)
The file was modified emission/incomplete_tests/TestProfile.py (diff)
The file was modified emission/tests/netTests/TestBuiltinUserCacheHandlerInput.py (diff)
The file was modified emission/tests/netTests/TestBuiltinUserCacheHandlerOutput.py (diff)
The file was modified emission/incomplete_tests/TestRecommendationPipeline.py (diff)
The file was modified emission/tests/coreTests/wrapperTests/TestUser.py (diff)
The file was modified bin/purge_database.py (diff)
The file was modified emission/incomplete_tests/TestUtilityModelPipeline.py (diff)
Commit 333f0a3a750bef30dfe7252c1f4a0b8b2070b839 by shankari
Remove all references to public data from the codebase
Public data will now be treated just like regular data, only stored in a
different server. This removes special handling of public data from the
codebase and simplifies it.
Nothing left in `emission` (the remaining matches are unrelated):
```
(emission)$ find emission/ -name \*.py | xargs grep -i public
emission//net/api/bottle.py:    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
emission//net/api/wsgiserver2.py:                    # localhost won't work if we've bound to a public IP,
emission//net/ext_service/gmaps/googlemaps.py:# Public License, available in the accompanying LICENSE.txt file.
```
Nothing left in `bin` except the script to pull data.
```
(emission)$ find bin/ -name \*.py | xargs grep -i public
bin//public/request_public_data.py:# This script pulls public data from the server and then loads it to a local server
bin//public/request_public_data.py:parser = argparse.ArgumentParser(prog="request_public_data")
bin//public/request_public_data.py:# Pulling public data in batches
bin//public/request_public_data.py:import emission.public.pull_and_load_public_data as plpd
```
The script to pull the data still exists because the batching is
potentially useful. It only works with skip authentication and can
either write to the database (as before) or save to a json file (new).
The hope is that we can eventually remove its dependency on the
e-mission-server codebase.
```
(emission)$ ./e-mission-py.bash bin/public/request_public_data.py --help
usage: request_public_data [-h] (-f OUTPUT_FILE | -d) [-v]
                           from_date to_date server_url phone_id key [key ...]

positional arguments:
  from_date             from_date (local time, inclusive) in the format of
                        YYYY-MM-DD-HH
  to_date               to_date (local time, exclusive) in the format of
                        YYYY-MM-DD-HH
  server_url            url of the server to pull data from i.e.
                        'http://localhost:8080' or
                        'https://e-mission.eecs.berkeley.edu'
  phone_id              the phone id to pull data for i.e. 'ucb.sdb.android.1'
                        or '4d21itu'
  key                   the keys to pull data for i.e. 'background/battery'
                        'statemachine/transition'. Complete list is at
                        https://github.com/e-mission/e-mission-server/blob/master/emission/core/wrapper/entry.py

optional arguments:
  -h, --help            show this help message and exit
  -f OUTPUT_FILE, --output_file OUTPUT_FILE
                        store to specified file
  -d, --database        store to local database
  -v, --verbose         turn on debugging
```
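For reference, a parser that produces a command line of the shape shown in the help output could look roughly like the sketch below. It is reconstructed only from the usage text and is not the actual code in `bin/public/request_public_data.py`; the `build_parser` name and the shortened help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the usage line in the help output above."""
    parser = argparse.ArgumentParser(prog="request_public_data")

    # Exactly one output destination must be chosen: a file or the database.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-f", "--output_file", help="store to specified file")
    group.add_argument("-d", "--database", action="store_true",
                       help="store to local database")

    parser.add_argument("-v", "--verbose", action="store_true",
                        help="turn on debugging")

    parser.add_argument("from_date",
                        help="local time, inclusive, as YYYY-MM-DD-HH")
    parser.add_argument("to_date",
                        help="local time, exclusive, as YYYY-MM-DD-HH")
    parser.add_argument("server_url", help="url of the server to pull data from")
    parser.add_argument("phone_id", help="the phone id to pull data for")
    parser.add_argument("key", nargs="+", help="the keys to pull data for")
    return parser
```

The mutually exclusive group is what yields the `(-f OUTPUT_FILE | -d)` part of the usage line, and `nargs="+"` yields `key [key ...]`.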
(commit: 333f0a3a750bef30dfe7252c1f4a0b8b2070b839)
The file was modified emission/storage/timeseries/aggregate_timeseries.py (diff)
The file was modified emission/tests/storageTests/TestTimeSeries.py (diff)
The file was modified emission/analysis/intake/cleaning/filter_accuracy.py (diff)
The file was modified emission/pipeline/scheduler.py (diff)
The file was modified bin/intake_multiprocess.py (diff)
The file was modified emission/storage/decorations/user_queries.py (diff)
The file was modified emission/public/pull_and_load_public_data.py (diff)
The file was modified bin/public/request_public_data.py (diff)
The file was modified emission/net/api/cfc_webapp.py (diff)
Commit f69e7e920ac33dac4f1b3413cea9e5d0200a0a11 by shankari
Support authenticating with mongodb
- Support using a url instead of a simple hostname
- Test the authentication by writing a separate test case
   - This requires mongodb to be started using --auth, and some tricky stuff
     while running multiple tests, and restarting after test failures.
   - So the tests are checked into a separate directory and not run automatically.
     They have to be run manually, one test at a time, if/when
     `get_database.py` is changed.
They also form an example of how to set up various users, which we can
refer to for our own testing and offer as a best practice for others to
use.
(commit: f69e7e920ac33dac4f1b3413cea9e5d0200a0a11)
The file was added emission/integrationTests/__init__.py
The file was modified emission/core/get_database.py (diff)
The file was added emission/integrationTests/storageTests/__init__.py
The file was modified emission/tests/storageTests/analysis_ts_common.py (diff)
The file was modified conf/storage/db.conf.sample (diff)
The file was added emission/integrationTests/storageTests/TestMongodbAuth.py
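To illustrate why a url is more flexible than a simple hostname, here is a minimal sketch of building a MongoDB connection url that optionally carries credentials. The `build_mongo_url` helper, its parameters, and the sample user name are hypothetical and not taken from `get_database.py`; the only assumptions about MongoDB itself are the standard `mongodb://` url form and the `authSource` option.

```python
from urllib.parse import quote_plus


def build_mongo_url(host, user=None, password=None, port=27017, auth_db="admin"):
    """Return a mongodb:// url, with credentials when a user is given.

    User name and password are percent-escaped so that characters such
    as '@' or '/' do not break the url.
    """
    if user is None:
        # No authentication: the url only names the server.
        return f"mongodb://{host}:{port}/"
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource={quote_plus(auth_db)}"
    )
```

The resulting string can be passed to `MongoClient` in place of a bare hostname, so the same config entry covers both authenticated and unauthenticated servers.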