Reputation:
After spending weeks doing analysis on my dataset, I come to realise something fishy that could actually impact the accuracy of my analysis (too bad).
In my table segments
containing user's trip segments splitted by user's travel mode, such that each segment has true value on its corresponding travel mode, I realised that some segments actually are having true values for more than one travel mode. This could be errors from users' annotation, for example:
SELECT trip_id, segment_id, true_foot, true_bike, true_bus, true_car, true_metro
FROM segments
WHERE trip_id=533888 limit 10;
trip_id | segment_id | true_foot | true_bike | true_bus | true_car | true_metro
---------+------------+-----------+-----------+----------+----------+------------
533888 | 55 | t | f | f | t | f
533888 | 53 | t | f | f | t | f
533888 | 51 | t | f | f | t | f
533888 | 49 | t | f | f | t | f
533888 | 47 | t | f | f | t | f
533888 | 45 | t | f | f | t | f
533888 | 43 | t | f | f | t | f
533888 | 41 | t | f | f | t | f
533888 | 37 | t | f | f | t | f
533888 | 35 | t | f | f | t | f
(10 rows)
In this case, the segment was annotated for both foot
and car
mode.
I'm left with no option than:
get the total number of such segments (if they aren't many, skipped them in my filtering)
somehow manually check with the users' to verify mode per segment (worst scenario: time consuming, underreporting, error prone).
How do I get the count of segments with more than one travel mode?
Upvotes: 0
Views: 513
Reputation: 26046
You can cast the boolean
to integer
and then count the rows that have true
in more than one column:
select count(*)
from segments
where (true_foot::integer + true_bike::integer + true_bus::integer + true_car::integer + true_metro::integer) > 1
Upvotes: 1