Chaz
Chaz

Reputation: 83

Find Magic Numbers C++

Magic Numbers

A positive integer is “magic” if, and only if, it can be reduced to 1 by repeatedly dividing it by 2 if it’s even or multiplying it by 3 and then adding 1 if it’s odd. So, for example, 3 is magic because 3 reduces first to 10 (3*3+1), then to 5 (10/2), then to 16 (5*3+1), then to 8 (16/2), then to 4 (8/2), then to 2 (4/2), and finally to 1 (2/2). The magic numbers hypothesis states that all positive integers are magic, or, formally: ∀x ∈ Z, MAGIC(x) where MAGIC(x) is the predicate “x is magic”.

We are supposed to develop a C++ program to find "Magic Numbers" from 1 to 5 Billion. It is supposed to take 80 seconds or less if being done correctly. Mine takes approximately 2 hours, and the example program we were given would take 14 days. Here is my code, what can I do to speed it up? Am I missing obvious optimization issues?

#include <iostream>
using namespace std;

bool checkMagic(unsigned long number);

int main()
{
  for(unsigned long i = 1; i <= 5000000000; i++)
  {
    if(i % 1000000 == 0)
    {
      //only print out every 1000000 numbers to keep track, but slow it down 
      //by printing out every single number
      cout<<i<<endl;
    }

    if(!checkMagic(i))
    {
      //not magic
      cout<<i<<" not a magic number"<<endl;
      break;
    }
  }
}

bool checkMagic(unsigned long number)
{
  if(number % 2 == 0)
  {
    number = number / 2;
  }
  else
  {
    number = (number * 3) + 1;
  }

  if(number !=  1)
  {
    checkMagic(number);
  }

  return 1;
}

Upvotes: 8

Views: 4855

Answers (4)

2785528
2785528

Reputation: 5566

I was speculating about what changes the optimizer might have made for this particular tail recursion.

So I created an iterative answer from my V9 code (in my other answer).

Change From:

bool checkMagicR(uint64_t n) 
   {
      //.. replace all 
   }

Change To:

// ITERATIVE VERSION 
bool checkMagicR(uint64_t n)   // ITERATIVE Performance
   {
      do
      {  uint64_t nxt;

         if (n & B00) // n-odd
         {
            if (n >= m_upperLimitN) // wraparound imminent abort
               return (false);    // recursion termination clause.

            // NOTE: the sum of 4 odd uint's is even, and will be halved
            // we need not wait for the next recursion to divide-by-2
            nxt = ((n+n+n+ONE)  >> ONE); // combine the arithmetic
            //     known even
         }
         else // n-even
         {
            nxt = (n >> ONE);   // bit shift divide by 2

            if (nxt < m_N)      // nxt < m_N start of recursion
               return ( true );     // recursion termination clause.
            // We need not de-curse to 1 because all
            // (n < m_N) are asserted as magic
         }

         n = nxt;
      } while(true);  // NO  recursion
   }

Did not fix the name. Did not fix comments or other mentioning of 'recursion' in this method or others. I changed some comments at top and bottom of this method.

Results:

  • Performance is identical. (44 seconds)

  • Mode 1 and mode 2 are both iterative.

  • Mode 3 is (still) recursive.

Upvotes: 0

2785528
2785528

Reputation: 5566

what can I do to speed it up? Am I missing obvious optimization issues?

  • for(unsigned long i = 1; i <= 5000000000; i++)

A for-loop is not the fastest coding style to invoke a method 5B times.

In my code see unwind100() and innerLoop(). The unwind'ing I coded removed 99% of the innerLoop overhead, which saved more than 5 seconds (on my desktop DD5). Your savings may vary. Wikipedia has an article about unwinding a loop, with a brief description of the savings potential.


  • if (i % 1000000 == 0)

This code apparently generates progress reports. At a cost of 5 billion comparisons, it accomplishes nothing for the magic number checks.

See my code outerLoop, which calls innerLoop 10 times. This provides a convenient location for 1 progress report each outer loop. Consider splitting your loops into an 'innerLoop' and an 'outerLoop'. Testing 5 Billion times is more expensive than adding 10 calls to innerLoops.


  • As in the comments, your checkMagic() return's '1' regardless of test results. A simple error, but I do not know if this error affects performance.

I think your checkMagic() does not achieve tail-recursion, which really will slow your code down.

See my method checkMagicR(), which has tail recursion, and returns a true or false.

Also, in your CheckMagic(), in the clause for odd input, you ignore the idea that (3n+1) must be an even number when n is positive and odd. My code performs an 'early-div-by-2', thus reducing the future effort when possible.


  • In my methods unwind10(), note that my code exploits the known even/odd pattern of input the for the sequence (2..5 Billion), by using checkMagicRO() (for odd inputs) and checkMagicRE() (for even).

In my earlier versions, the code wasted time detecting that the known even input was even, then dividing it by 2, and then returning true. checkMagicRE() skips the test and divide, saving time.

CheckMagicRO() skips the test for odd, then performs the odd stuff and continues into checkMagicR().


  • if(number != 1) { checkMagic(number); }

Your recursion continues all the way down to 1 ... not necessary, and potentially the biggest hit to overall performance.

See my checkMagicR() "recursion termination clause". My code stops when ((n/2) < m_N), m_N being the test value that started this recursion. The idea is that all smaller test inputs are already proven magic, else the code would have asserted. It also halts when wraparound is imminent ... which has not happened.


Results:

My desktop (DD5) is older, slower, and running Ubuntu 15.10 (64). This code was developed with g++ v5.2.1, and only finishes without timeout assert using -O3.

DD5 is apparently 2x slower than the machine used by Amy Tavory (comparing how his code ran on DD5).

On DD5, my code completes in ~44 seconds in mode 1, and 22 seconds in mode 2.

A faster machine might complete this code in 11 seconds (in mode 2).


Full Code:

// V9

#include <chrono>
#include <iomanip>
#include <iostream>
#include <limits>
#include <sstream>
#include <thread>
#include <vector>
#include <cassert>

// chrono typedef - shorten one long name
typedef std::chrono::high_resolution_clock  Clock_t;


class T431_t
{

private:
   uint64_t           m_N;             // start-of-current-recursion
   const uint64_t     m_upperLimitN;   // wrap-around avoidance / prevention

   std::stringstream  m_ssMagic;       // dual threads use separate out streams

   uint64_t           m_MaxRcrsDpth;   // diag only
   uint64_t           m_Max_n;         // diag only
   uint64_t           m_TotalRecurse;  // diag only

   Clock_t::time_point  m_startMS;     // Derived req - progress info

   std::stringstream  m_ssDuration;

   std::ostream*      m_so;            // dual threads - std::cout or m_ssMagic

   uint64_t           m_blk;

   const int64_t      m_MaxThreadDuration; // single Max80KMs, dual Max40KMs


   // enum of known type (my preference for 'internal' constants)
   enum T431_Constraints : uint64_t
   {
      ZERO     = 0,
      B00      = 1,    // least-significant-bit is bit 0
      ONE      = 1,    // lsbit set
      TWO      = 2,
      THREE    = 3,
      //
      K        = 1000, // instead of 1024,
      //
      BlkSize  = ((K * K * K) / 2),  // 0.5 billion
      //
      MaxSingleCoreMS = (80 * K), // single thread max
      MaxDualCoreMS   = (40 * K)  // dual thread max
   };

   // convenience method:  show both hex and decimal on one line (see dtor)
   inline std::string showHexDec(const std::string& label, uint64_t val) {
      std::stringstream ss;
      ss << "\n" << label
         << " = 0x" << std::hex << std::setfill('0') << std::setw(16) << val
         << "  ("   << std::dec << std::setfill(' ') << std::setw(22) << val << ")";
      return (ss.str());
   }

   // duration: now() - m_startMS
   std::chrono::milliseconds  getChronoMillisecond() {
      return (std::chrono::duration_cast<std::chrono::milliseconds>
              (Clock_t::now() - m_startMS));
   }

   // reports milliseconds since m_startMS time
   void ssDurationPut (const std::string& label)
      {
         m_ssDuration.str(std::string());  // empty string
         m_ssDuration.clear();             // clear ss flags

         std::chrono::milliseconds  durationMS = getChronoMillisecond();

         uint64_t  durationSec = (durationMS.count() / 1000);
         uint64_t durationMSec = (durationMS.count() % 1000);

         m_ssDuration
            << label << std::dec << std::setfill(' ') << std::setw(3) << durationSec
            << "."   << std::dec << std::setfill('0') << std::setw(3) << durationMSec
            << " sec";
      }

   inline void reportProgress()
      {
         std::chrono::milliseconds  durationMS = getChronoMillisecond();

         assert(durationMS.count() < m_MaxThreadDuration); // normal finish -> "no endless loop"
         //
         uint64_t  durationSec = (durationMS.count() / 1000);
         uint64_t durationMSec = (durationMS.count() % 1000);

         *m_so << " m_blk = " << m_blk
               << "  m_N = 0x" << std::setfill('0') << std::hex << std::setw(16) << m_N
               << "  " << std::dec << std::setfill(' ') << std::setw(3) << durationSec
               << "."  << std::dec << std::setfill('0') << std::setw(3) << durationMSec
               << " sec  (" << std::dec << std::setfill(' ') << std::setw(10) << m_N << ")"
               << std::endl;
      }


public:

   T431_t(uint64_t maxDuration = MaxSingleCoreMS) :
      m_N                 (TWO),  // start val of current recursion
      m_upperLimitN       (initWrapAroundLimit()),
      // m_ssMagic        // default ctor ok - empty
      m_MaxRcrsDpth       (ZERO),
      m_Max_n             (TWO),
      m_TotalRecurse      (ZERO),
      m_startMS           (Clock_t::now()),
      // m_ssDuration     // default ctor ok - empty
      m_so                (&std::cout), // default
      m_blk               (ZERO),
      m_MaxThreadDuration (maxDuration)
      {
         m_ssMagic.str().reserve(1024*2);
         m_ssDuration.str().reserve(256);
      }

   ~T431_t()
      {
         // m_MaxThreadDuration
         // m_blk
         // m_so
         // m_ssDuration
         // m_startMS
         // m_Max_n
         // m_TotalRecurse
         // m_MaxRcrsDpth
         if (m_MaxRcrsDpth) // diag only
         {
            ssDurationPut ("\n      duration = " );

            std::cerr << "\n T431() dtor        "
               //
                      << showHexDec("         u64MAX",
                                    std::numeric_limits<uint64_t>::max()) << "\n"
               //
                      << showHexDec("        m_Max_n", m_Max_n)
                      << showHexDec("  (3*m_Max_n)+1", ((3*m_Max_n)+1))
                      << showHexDec("    upperLimitN", m_upperLimitN)
               //                        ^^^^^^^^^^^ no wrap-around possible when n is less
               //
                      << "\n"
                      << showHexDec("  m_MaxRcrsDpth",  m_MaxRcrsDpth)
                      << "\n"
                      << showHexDec(" m_TotalRecurse",  m_TotalRecurse)
               //
                      << m_ssDuration.str() << std::endl;
         }
         // m_ssMagic
         // m_upperLimitN
         // m_N
         if(m_MaxThreadDuration == MaxSingleCoreMS)
            std::cerr << "\n  " << __FILE__ << " by Douglas O. Moen     Compiled on = "
                      << __DATE__ << "  " << __TIME__ << "\n";
      }

private:

   inline bool checkMagicRE(uint64_t nE) // n is known even, and the recursion starting point
      {
         return(true);  // for unit test, comment out this line to run the asserts

         // unit test - enable both asserts
         assert(nE == m_N);             // confirm recursion start value
         assert(ZERO == (nE & B00));    // confirm even
         return(true);                  // nothing more to do
      }

   inline bool checkMagicRO(uint64_t nO) // n is known odd,  and the recursion starting point
      {
         // unit test - uncomment the following line to measure checkMagicRE() performance
         // return (true);  // when both methods do nothing -  0.044

         // unit test - enable both these asserts
         // assert(nO == m_N);  // confirm recursion start value
         // assert(nO & B00);   // confirm odd
         uint64_t nxt;
         {
            assert(nO < m_upperLimitN);  // assert that NO wrap-around occurs
            // NOTE: the sum of 4 odd uint's is even, and is destined to be
            // divided by 2 in the next recursive invocation.
            // we need not wait to divide by 2
            nxt = ((nO+nO+nO+ONE)  >> ONE); // combine the arithmetic
            //      ||||||||||||   ||||||
            //      sum is even    unknown after sum divided by two
         }
         return (checkMagicR(nxt));
      }


   inline void unwind8()
      {
         // skip 0, 1
         assert(checkMagicRE (m_N)); m_N++;  // 2
         assert(checkMagicRO (m_N)); m_N++;  // 3
         assert(checkMagicRE (m_N)); m_N++;  // 4
         assert(checkMagicRO (m_N)); m_N++;  // 5
         assert(checkMagicRE (m_N)); m_N++;  // 6
         assert(checkMagicRO (m_N)); m_N++;  // 7
         assert(checkMagicRE (m_N)); m_N++;  // 8
         assert(checkMagicRO (m_N)); m_N++;  // 9
      }

   inline void unwind10()
      {
         assert(checkMagicRE (m_N)); m_N++;  // 0
         assert(checkMagicRO (m_N)); m_N++;  // 1
         assert(checkMagicRE (m_N)); m_N++;  // 2
         assert(checkMagicRO (m_N)); m_N++;  // 3
         assert(checkMagicRE (m_N)); m_N++;  // 4
         assert(checkMagicRO (m_N)); m_N++;  // 5
         assert(checkMagicRE (m_N)); m_N++;  // 6
         assert(checkMagicRO (m_N)); m_N++;  // 7
         assert(checkMagicRE (m_N)); m_N++;  // 8
         assert(checkMagicRO (m_N)); m_N++;  // 9
      }

   inline void unwind98()
      {
         unwind8();
         unwind10();  // 1
         unwind10();  // 2
         unwind10();  // 3
         unwind10();  // 4
         unwind10();  // 5
         unwind10();  // 6
         unwind10();  // 7
         unwind10();  // 8
         unwind10();  // 9
      }

   inline void unwind100()
      {
         unwind10();  // 0
         unwind10();  // 1
         unwind10();  // 2
         unwind10();  // 3
         unwind10();  // 4
         unwind10();  // 5
         unwind10();  // 6
         unwind10();  // 7
         unwind10();  // 8
         unwind10();  // 9
      }

   inline void innerLoop (uint64_t start_N, uint64_t end_N)
      {
         for (uint64_t iLoop = start_N; iLoop < end_N; iLoop += 100)
         {
            unwind100();
         }

         reportProgress();
      }

   inline void outerLoop(const std::vector<uint64_t>&  start_Ns,
                         const std::vector<uint64_t>&  end_Ns)
      {
         *m_so << "\n outerLoop \n  m_upperLimitN = 0x"
               << std::hex << std::setfill('0') << std::setw(16) << m_upperLimitN
               << "\n            m_N = 0x"      << std::setw(16) << start_Ns[0]
               << "  " << std::dec << std::setfill(' ') << std::setw(3) << 0
               << "."  << std::dec << std::setfill('0') << std::setw(3) << 0
               << " sec  (" << std::dec << std::setfill(' ') << std::setw(10)
               << start_Ns[0] << ")" << std::endl;

         size_t szStart = start_Ns.size();
         size_t szEnd   = end_Ns.size();
         assert(szEnd == szStart);

         m_blk = 0;

         {  // when blk 0 starts at offset 2 (to skip values 0 and 1)
            if(TWO == start_Ns[0])
            {
               m_N = TWO;  // set m_N start

               unwind98(); // skip 0 and 1

               assert(100 == m_N); // confirm

               innerLoop (100, end_Ns[m_blk]);

               m_blk += 1;
            }
            else // thread 2 blk 0 starts in the middle of 5 billion range
            {
               m_N = start_Ns[m_blk];  // set m_N start, do full block
            }
         }

         for (; m_blk < szStart;  ++m_blk)
         {
            innerLoop (start_Ns[m_blk], end_Ns[m_blk]);
         }
      }


   // 1 == argc
   void  exec1 (void) // no parameters --> check all values
      {
         const std::vector<uint64_t> start_Ns =
            {         TWO, (BlkSize*1), (BlkSize*2), (BlkSize*3), (BlkSize*4),
              (BlkSize*5), (BlkSize*6), (BlkSize*7), (BlkSize*8), (BlkSize*9)  };
         //    blk 0        blk 1        blk 2        blk 3        blk 4
         //    blk 5        blk 6        blk 7        blk 8        blk 9
         const std::vector<uint64_t> end_Ns =
            { (BlkSize*1), (BlkSize*2), (BlkSize*3), (BlkSize*4), (BlkSize*5),
              (BlkSize*6), (BlkSize*7), (BlkSize*8), (BlkSize*9), (BlkSize*10) };

         m_startMS = Clock_t::now();

         // outerLoop :  10 blocks generates 10 progressReports()
         outerLoop( start_Ns, end_Ns);

         ssDurationPut("");

         std::cout << "   execDuration = " << m_ssDuration.str() << " " << std::endl;
      } // void  exec1 (void)


   void exec2a ()  //  thread entry a
      {
         //lower half of test range
         const std::vector<uint64_t> start_Ns =
            {        TWO , (BlkSize*1), (BlkSize*2), (BlkSize*3), (BlkSize*4) };
         //    blk 0        blk 1        blk 2        blk 3        blk 4
         const std::vector<uint64_t> end_Ns =
            { (BlkSize*1), (BlkSize*2), (BlkSize*3), (BlkSize*4), (BlkSize*5) };

         m_startMS = Clock_t::now();

         // m_so to std::cout
         // uses 5 blocks to gen 5 progress indicators
         outerLoop (start_Ns, end_Ns);

         ssDurationPut("");

         std::cout << "   execDuration =  " << m_ssDuration.str() << " (a)   " << std::endl;
      } // exec2a (void)


   void exec2b ()  //  thread entry b
      {
         // upper half of test range
         const std::vector<uint64_t> start_Ns =
            { (BlkSize*5), (BlkSize*6), (BlkSize*7), (BlkSize*8), (BlkSize*9)  };
         //    blk 5        blk 6        blk 7        blk 8        blk 9
         const std::vector<uint64_t> end_Ns =
            { (BlkSize*6), (BlkSize*7), (BlkSize*8), (BlkSize*9), (BlkSize*10) };

         m_startMS = Clock_t::now();

         m_so = &m_ssMagic;
         // uses 5 blocks to gen 5 progress indicators
         outerLoop(start_Ns, end_Ns);

         ssDurationPut("");

         m_ssMagic << "   execDuration =  " << m_ssDuration.str() << " (b)  " << std::endl;
      } // exec2b (void)


   // 3 == argc
   int exec3 (std::string argv1,  // recursion start value
              std::string argv2)  // values count
      {
         uint64_t start_N = std::stoull(argv1);
         uint64_t length  = std::stoull(argv2);

         // range checks
         {
            std::string note1;
            switch (start_N) // limit check
            {
            case 0: { start_N = 3; length -= 3;
                  note1 = "... 0 is below test range (change start to 3 ";
            } break;

            case 1: { start_N = 3; length -= 2;
                  note1 = "... 1 is defined magic (change start to 3";
            } break;

            case 2: { start_N = 3; length -= 1;
                  note1 = "... 2 is trivially magic (change start to 3";
            } break;

            default:
            {  // when start_N from user is even
               if (ZERO == (start_N & B00))
               {
                  start_N -= 1;
                  note1 = "... even is not an interesting starting point";
                  length  += 1;
               }
            }
            break;

            } // switch (start_N)

            // start_N not restrained ... allow any manual search
            //    but uin64_t wrap-around still preempted

            std::cout << "\n start_N: " << start_N << "  " << note1
                      << "\n  length: " << length  << "  " << std::endl;
         }

         m_startMS = Clock_t::now();

         for (m_N = start_N;   m_N < (start_N + length);   ++m_N)
         {
            bool magicNumberFound = checkMagicRD (m_N, 1);

            std::cout << m_ssMagic.str() << std::dec << m_N
                      << "    m_MaxRcrsDpth: "  << m_MaxRcrsDpth << ";"
                      << "    m_Max_n: "  << m_Max_n << ";" << std::endl;

            m_ssMagic.str(std::string()); // empty string
            m_ssMagic.clear();            // clear flags)
            if (!magicNumberFound)
            {
               std::cerr << m_N << " not a magic number!" << std::endl;
               break;
            }

         } // for (uint64_t k = kStart; k < kEnd; ++k)

         ssDurationPut("");

         std::cout << "   execDuration =  " << m_ssDuration.str() << "    " << std::endl;

         return(0);
      } // int exec3 (std::string argv1,  std::string argv2)


   // conversion
   std::string ssMagicShow() { return (m_ssMagic.str()); }


   // D3: use recursion for closer match to definition
   bool checkMagicR(uint64_t n)   // recursive Performance
      {
         uint64_t nxt;
         {
            if (n & B00) // n-odd
            {
               if (n >= m_upperLimitN) // wraparound imminent abort
                  return (false);    // recursion termination clause

               // NOTE: the sum of 4 odd uint's is even, and will be halved
               // we need not wait for the next recursion to divide-by-2
               nxt = ((n+n+n+ONE)  >> ONE); // combine the arithmetic
               //     known even
            }
            else // n-even
            {
               nxt = (n >> ONE);   // bit shift divide by 2

               if (nxt < m_N)      // nxt < m_N start of recursion
                  return ( true ); // recursion termination clause
               // We need not de-curse to 1 because all
               // (n < m_N) are asserted as magic
            }
         }
         return (checkMagicR(nxt));  // tail recursion
      } // bool  checkMagicR(uint64_t n)


   // D4: diagnostic text output
   bool checkMagicRD (uint64_t n, uint64_t rd)  // rd: recursion depth
      {
         if (n > m_Max_n) m_Max_n = n;
         m_TotalRecurse += 1;

         m_ssMagic << "\n" << rd << std::setw(static_cast<int>(rd)) << " "
                   << " checkMagicRD (" << m_N << ", " << n << ")";

         uint64_t nxt;
         {
            if (n & B00) // n-odd
            {
               if (n >= m_upperLimitN) // wraparound imminent abort
                  return ( terminateCheckMagic (nxt, rd, false) );

               // NOTE: the sum of 4 odd uint's is even, and will be divided by 2
               // we need not wait for the next recursion to divide by 2
               nxt = ((n+n+n+ONE)  >> ONE); // combine the arithmetic
               //      |||||||||   ||||||
               //    sum is even   unknown after divide by two
            }
            else // n-even
            {
               nxt = (n >> ONE);     // bit shift divide by 2

               if (nxt < m_N)      // nxt < m_N start of recursion
                  return ( terminateCheckMagic (nxt, rd, true) );
               // We need not de-curse to 1 because all
               // (n < m_N) are asserted as magic
            }
         }
         return (checkMagicRD (nxt, (rd+1))); // tail recursion
      } //  bool checkMagicRD(uint64_t, uint64_t rd)


   bool terminateCheckMagic (uint64_t n, uint64_t rd, bool rslt)
      {
         if (rd > m_MaxRcrsDpth) // diag only
         {
            std::chrono::milliseconds  durationMS = getChronoMillisecond();

            uint64_t durationSec  = durationMS.count() / 1000;
            uint64_t durationMSec = durationMS.count() % 1000;

            m_MaxRcrsDpth = rd;
            // generate noise each update - this does not happen often
            static uint64_t count = 0;
            std::cerr << "\n\n" << std::setfill(' ')
                      << std::setw(30) << "["
                      << std::setw(4) << durationSec << "." << std::setfill('0') << std::setw(3)
                      << durationMSec << std::setfill(' ')
                      << " sec]   m_N: " << std::setw(10) << m_N
                      << "    n: " << std::setw(10) << n
                      << "    MaxRcrsDpth: " << std::setw(5) << m_MaxRcrsDpth
                      << "    Max_n:  " << std::setw(16) << m_Max_n
                      << "   (" << count << ")" << std::flush;
            count += 1; // how many times MaxRcrsDpth updated
         }

         m_ssMagic << "\n " << std::setw(3) << rd << std::setw(static_cast<int>(rd)) << " "
                   << "  " << (rslt ? "True " : ">>>false<<< ") << m_N << ", ";

         return (rslt);
      }

   uint64_t initWrapAroundLimit()
      {
         uint64_t upLimit = 0;
         uint64_t u64MAX = std::numeric_limits<uint64_t>::max();
         // there exist a max number,  m_upperLimitN
         // where 3n+1 < u64MAX, i.e.
         upLimit = ((u64MAX - 1) / 3);

         return (upLimit);
      } // uint64_t initWrapAroundLimit()

public:

   int exec (int argc, char* argv[])
      {
         int retVal = -1;

         { time_t t0 = std::time(nullptr); while(t0 == time(nullptr)) { /* do nothing*/ }; }
         // result: consistent run time

         switch (argc)
         {

         case 1: // no parameters: 5 billion tests, 10 blocks, this instance, < 80 sec
         {
            exec1();
            retVal = 0;
         }
         break;

         case 2: // one parameter: 2 threads each runs 5/2 billion tests, in 5 blocks,
         {  //                     two new instances,  < 40 sec
            // 2 new objects
            T431_t  t431a(MaxDualCoreMS); // lower test sequence
            T431_t  t431b(MaxDualCoreMS); // upper test sequence

            // 2 threads
            std::thread tA (&T431_t::exec2a, &t431a);
            std::thread tB (&T431_t::exec2b, &t431b);

            // 2 join's
            tA.join();
            tB.join();

            // lower test sequence (tA) output directly to std::cout
            // upper test sequence (tB) captured in T431_t::m_ssMagic.  send it now
            std::cout << t431b.ssMagicShow() << std::endl;

            retVal = 0;
         }
         break;


         case 3: // argv[1] -> start address,  argv[2] -> length,
         {
            retVal = exec3 (std::string(argv[1]), std::string(argv[2]));
         } break;


         default :
         {
            std::cerr
               << "\nUsage: "
               << "\n argc (= " << argc << ") not expected, should be 0, 1, or 2 parameters."
               << "\n   1  (no parameters) : 5 billion tests, 10 blocks, < 80 sec\n"
               //
               << "\n   2  (one parameter) : 2 threads, each 5/2 billion tests, 5 blocks,  < 40 sec\n"
               //
               << "\n   3  (two parameters): verbose mode: start argv[1], test count argv[2] \n"
               //
               << "\n   3.1: \"./dumy431V4 3 1000  1> /tmp/dumy431V4.out2   2> /tmp/dumy431V4.out2a  "
               << "\n                     3  1 K    std::cout redirected     std::cerr redirected \n"
               //
               << "\n   3.2: \"./dumy431V4 3 1000000000  1>  /dev/null      2> /tmp/dumy431V4.out2b   "
               << "\n                     3  1 B       discard std::cout  std::cerr redirected \n"
               //
               << "\n   3.3: \"./dumy431V4 15 100   "
               << "\n                     15 100   (cout and cerr to console) \n"
               << std::endl;
            retVal = 0;
         }

         } // switch

         return(retVal);
      } // int exec(int argc, char* argv[])

};  // class T431


int main (int argc, char* argv[])
{
   std::cout << "argc: " << argc << std::endl;
   for (int i=0; i<argc; i+=1) std::cout << argv[i] << " ";
   std::cout << std::endl;

   setlocale(LC_ALL, "");
   std::ios::sync_with_stdio(false);

   int retVal;
   {
      T431_t  t431;
      retVal = t431.exec(argc, argv);
   }

   std::cout << "\nFINI " << std::endl;
   return(retVal);
}

The code submitted:

a) uses assert(x) for all checks (because assert is quick and has side-effects that the optimizer can not ignore).

b) detects any endless loop by using an assert of the duration.

c) asserts to prevent any wrap-around.

When any assert occurs, the program does not terminate normally.

When this code finishes normally, it means:

 a) no disproven numbers found, and
 b) the running time < 80 seconds, so no endless loop occurred      and
 c) no wrap-around occurred

Notes about Recursive checkMagicR(uint64_t n):

3 forms of n are available in the recursion:

  • formal parameter n - typical of recursion calls

  • auto var nxt - receives the computed value of n for next recursive call

  • data attribute: m_N - the starting point of the currently running recursion effort

Upvotes: 0

Ami Tavory
Ami Tavory

Reputation: 76297

This question basically asks to verify the Collatz Conjecture up to 5B.

I think the key here is, for each number n we are checking, to view the optimistic scenario and the pessimistic scenario, and to check the optimistic one before reverting to the pessimistic one.

In the optimistic scenario, as we we modify n according to the n / 2 ; 3n + 1 rule, the sequence of numbers will:

  • in a finite number of steps become smaller than n (in which case we can check what we know about that smaller number).

  • will not cause an overflow in the steps.

(on which, as TonyK correctly notes, we cannot rely (even not on the first)).

So, for the optimistic scenario, we can use the following function:

#include <unordered_set>
#include <set>
#include <iostream>
#include <memory>
#include <list>
#include <gmp.h>

using namespace std;

using found_set = unordered_set<size_t>;

bool fast_verify(size_t i, size_t max_tries, const found_set &found) {
    size_t tries = 0;
    size_t n = i;
    while(n != 1) {
        if(++tries == max_tries ) 
            return false;

        if(n < i)
            return found.empty() || found.find(i) == found.end();
        if(n % 2 == 0)
            n /= 2;
        else if(__builtin_mul_overflow (n, 3, &n) || __builtin_add_overflow(n, 1, &n))
            return false;
    }   

    return true;
}

Note the following:

  1. The function only attempts to verify the conjecture for the number it receives. If it returns true, it has been verified. If it returns false, it just means it hasn't been verified (i.e., it does not mean it has been disproved).

  2. It takes a parameter max_tries, and only verifies up to this number of steps. If the number has been exceeded, it does not try to discern whether this is part of an infinite loop or not - it just returns that verification failed.

  3. It takes an unordered_set of known numbers that have failed (of course, if the Collatz conjecture is true, this set will always be empty).

  4. It detects overflow via __builtin_*_overflow. (Unfortunately, this is gcc specific, then. A different platform might require a different set of such functions.)

Now for the slow-but-sure function. This function uses the GNU MP multi-precision arithmetic library. It checks for an infinite loop by maintaining the sequence of numbers it has already encountered. This function returns true if the conjecture has been proved for this number, and false if has been disproved for this number (note the difference from the previous fast verification).

bool slow_check(size_t i) {
    mpz_t n_; 
    mpz_init(n_);

    mpz_t rem_;
    mpz_init(rem_);

    mpz_t i_; 
    mpz_init(i_);

    mpz_set_ui(i_, i); 
    mpz_set_ui(n_, i); 

    list<mpz_t> seen;

    while(mpz_cmp_ui(n_, 1) != 0) {
        if(mpz_cmp_ui(n_, i) < 0)
            return true;
        mpz_mod_ui(rem_, n_, 2); 
        if(mpz_cmp_ui(rem_, 0) == 0) {
            mpz_div_ui(n_, n_, 2); 
        }   
        else {
            mpz_mul_ui(n_, n_, 3); 
            mpz_add_ui(n_, n_, 1); 
       }   
       seen.emplace_back(n_);
       for(const auto &e0: seen)
           for(const auto &e1: seen)
               if(&e0 != &e1 && mpz_cmp(e0, e1) == 0)
                   return false;
   }   

   return true;
}

Finally, main maintains an unordered_set of the disproven numbers. for each number, it optimistically tries to verify the conjecture, then, if it fails (for overflow or exceeding the number of iterations), uses the slow method:

int main()
{
    const auto max_num = 5000000000;
    found_set found;

    for(size_t i = 1; i <= max_num; i++) {
        if(i % 1000000000 == 0)
            cout << "iteration " << i << endl;

        auto const f = fast_verify(i, max_num, found);
        if(!f && !slow_check(i))
            found.insert(i);
    }

    for(auto e: found)
        cout << e << endl;
}

Running the full code (below) gives:

$ g++ -O3 --std=c++11 magic2.cpp -lgmp && time ./a.out
iteration 1000000000
iteration 2000000000
iteration 3000000000
iteration 4000000000
iteration 5000000000

real    1m3.933s
user    1m3.924s
sys 0m0.000s

$ uname -a
Linux atavory 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ sudo lshw | grep -i cpu
      *-cpu
           description: CPU
           product: Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz
           bus info: cpu@0
           version: Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz
           capabilities: x86-64 fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts cpufreq

I.e., no disproven numbers found, and the running time at ~64 seconds.


Full code:

#include <unordered_set>
#include <set>
#include <iostream>
#include <memory>
#include <list>
#include <gmp.h>

using namespace std;

using found_set = unordered_set<size_t>;

bool fast_verify(size_t i, size_t max_tries, const found_set &found) {
    size_t tries = 0;
    size_t n = i;
    while(n != 1) {
        if(++tries == max_tries )
            return false;

        if(n < i)
            return found.empty() || found.find(i) == found.end();
        if(n % 2 == 0)
            n /= 2;
        else if(__builtin_mul_overflow (n, 3, &n) || __builtin_add_overflow(n, 1, &n))
            return false;
    }   

    return true;
}

bool slow_check(size_t i) {
    mpz_t n_; 
    mpz_init(n_);

    mpz_t rem_;
    mpz_init(rem_);

    mpz_t i_; 
    mpz_init(i_);

    mpz_set_ui(i_, i); 
    mpz_set_ui(n_, i); 

    list<mpz_t> seen;

    while(mpz_cmp_ui(n_, 1) != 0) {
        if(mpz_cmp_ui(n_, i) < 0)
            return true;
        mpz_mod_ui(rem_, n_, 2); 
        if(mpz_cmp_ui(rem_, 0) == 0) {
            mpz_div_ui(n_, n_, 2); 
        }   
        else {
            mpz_mul_ui(n_, n_, 3); 
            mpz_add_ui(n_, n_, 1); 
       }   
       seen.emplace_back(n_);
       for(const auto &e0: seen)
           for(const auto &e1: seen)
               if(&e0 != &e1 && mpz_cmp(e0, e1) == 0)
                   return false;
   }   

   return true;
}


int main()
{
    const auto max_num = 5000000000;
    found_set found;

    for(size_t i = 1; i <= max_num; i++) {
        if(i % 1000000000 == 0)
            cout << "iteration " << i << endl;

        auto const f = fast_verify(i, max_num, found);
        if(!f && !slow_check(i))
            found.insert(i);
    }

    for(auto e: found)
        cout << e << endl;

    return 0;
}

Upvotes: 4

TonyK
TonyK

Reputation: 17114

Your code, and Ami Tavory's answer, completely ignore the issue of overflow. If there were a non-magic number in your range, then the Collatz iteration would either enter a loop, or increase without bound. If it increases without bound, it will certainly overflow, whatever the size of your unsigned long; and if it enters a loop, it might well overflow before it gets there. So you must put an overflow check in there:

#include <limits.h>
...
if (number > (ULONG_MAX - 1) / 3)
    PrintAnApologyAndDie() ;
number = (number * 3) + 1;

Upvotes: 2

Related Questions