user12587364
user12587364

Reputation:

Dealing with Boolean value across multiple columns

After spending weeks doing analysis on my dataset, I come to realise something fishy that could actually impact the accuracy of my analysis (too bad).

In my table segments containing user's trip segments splitted by user's travel mode, such that each segment has true value on its corresponding travel mode, I realised that some segments actually are having true values for more than one travel mode. This could be errors from users' annotation, for example:

SELECT trip_id, segment_id, true_foot, true_bike, true_bus, true_car, true_metro 
FROM segments 
WHERE trip_id=533888 limit 10;
 trip_id | segment_id | true_foot | true_bike | true_bus | true_car | true_metro 
---------+------------+-----------+-----------+----------+----------+------------
  533888 |         55 | t         | f         | f        | t        | f
  533888 |         53 | t         | f         | f        | t        | f
  533888 |         51 | t         | f         | f        | t        | f
  533888 |         49 | t         | f         | f        | t        | f
  533888 |         47 | t         | f         | f        | t        | f
  533888 |         45 | t         | f         | f        | t        | f
  533888 |         43 | t         | f         | f        | t        | f
  533888 |         41 | t         | f         | f        | t        | f
  533888 |         37 | t         | f         | f        | t        | f
  533888 |         35 | t         | f         | f        | t        | f
(10 rows)

In this case, the segment was annotated for both foot and car mode.

I'm left with no option than:

  1. get the total number of such segments (if they aren't many, skipped them in my filtering)

  2. somehow manually check with the users' to verify mode per segment (worst scenario: time consuming, underreporting, error prone).

How do I get the count of segments with more than one travel mode?

Upvotes: 0

Views: 513

Answers (1)

Andronicus
Andronicus

Reputation: 26046

You can cast the boolean to integer and then count the rows that have true in more than one column:

select count(*)
from segments
where (true_foot::integer + true_bike::integer + true_bus::integer + true_car::integer + true_metro::integer) > 1

Upvotes: 1

Related Questions